Abstract

The Brownian motion $(U^N_t)_{t\ge 0}$ on the unitary group converges, as a process, to the free unitary
Brownian motion $(u_t)_{t\ge 0}$ as $N\to\infty$. In this paper, we prove that it converges strongly as a process:
not only in distribution but also in operator norm. In particular, for a fixed time t > 0, we prove 
that the spectral measure has a hard edge: there are no outlier eigenvalues in the limit. We also prove 
an extension theorem: any strongly convergent collection of random matrix ensembles independent 
from a unitary Brownian motion also converges strongly jointly with the Brownian motion. We give
an application of this strong convergence to the Jacobi process. 
1 Introduction

This paper is concerned with convergence of the noncommutative distribution of the standard Brownian
motion on unitary groups. Let $\mathbb{M}_N$ denote the space of $N\times N$ complex matrices, and let
$\mathrm{tr} = \frac{1}{N}\mathrm{Tr}$ denote the normalized trace. We denote the unitary group in $\mathbb{M}_N$ as
$\mathbb{U}_N$. The Brownian motion on $\mathbb{U}_N$ is the diffusion process $(U^N_t)_{t\ge 0}$ started at the identity
with infinitesimal generator $\frac12\Delta_{\mathbb{U}_N}$, where $\Delta_{\mathbb{U}_N}$ is the left-invariant Laplacian
on $\mathbb{U}_N$. (This is uniquely defined up to a choice of $N$-dependent scale;
see Section 2.1 for precise definitions, notation, and discussion.)

For each fixed $t > 0$, $U^N_t$ is a random unitary matrix, whose spectrum $\mathrm{spec}(U^N_t)$ consists of $N$
eigenvalues $\lambda_1(U^N_t), \ldots, \lambda_N(U^N_t)$. The empirical spectral distribution, also known as the empirical law of
eigenvalues, of $U^N_t$ (for a fixed $t > 0$) is the random probability measure $\mathrm{Law}_{U^N_t}$ on the unit circle $\mathbb{U}_1$
that puts equal mass on each eigenvalue (counted according to multiplicity):

$$\mathrm{Law}_{U^N_t} = \frac{1}{N}\sum_{j=1}^{N} \delta_{\lambda_j(U^N_t)}.$$

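In other words: $\mathrm{Law}_{U^N_t}$ is the random measure determined by the characterization that its integral
against test functions $f \in C(\mathbb{U}_1)$ is given by

$$\int_{\mathbb{U}_1} f \, d\mathrm{Law}_{U^N_t} = \frac{1}{N}\sum_{j=1}^{N} f(\lambda_j(U^N_t)). \tag{1.1}$$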
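The characterization (1.1) translates directly into a computation. The following is a minimal sketch, assuming NumPy; the function name `esd_integral` and the example matrix are hypothetical and chosen only for illustration.

```python
import numpy as np

def esd_integral(U, f):
    """Integrate f against the empirical spectral distribution of U, as in (1.1)."""
    eigenvalues = np.linalg.eigvals(U)   # the N eigenvalues, counted with multiplicity
    return np.mean(f(eigenvalues))       # (1/N) * sum_j f(lambda_j)

# Hypothetical example: a diagonal unitary whose eigenvalues are the 4th roots of unity.
U = np.diag([1, 1j, -1, -1j])
first_moment = esd_integral(U, lambda w: w)       # equals tr(U)
fourth_moment = esd_integral(U, lambda w: w**4)   # equals tr(U^4)
```

Taking $f(w) = w^n$ recovers the trace moments $\mathrm{tr}[(U)^n]$, which are the quantities controlled later in the paper.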


In [5], Biane showed that this random measure converges weakly almost surely to a deterministic limit
probability measure $\nu_t$:

$$\lim_{N\to\infty} \int_{\mathbb{U}_1} f \, d\mathrm{Law}_{U^N_t} = \int_{\mathbb{U}_1} f \, d\nu_t \quad \text{a.s.}, \qquad f \in C(\mathbb{U}_1). \tag{1.2}$$

The measure $\nu_t$ can be described as the spectral measure of a free unitary Brownian motion (cf. Section
2.4). For $t > 0$, $\nu_t$ possesses a continuous density that is symmetric about $1 \in \mathbb{U}_1$, and is supported on
an arc strictly contained in the circle for $0 < t < 4$; for $t \ge 4$, $\mathrm{supp}\,\nu_t = \mathbb{U}_1$.

The result of (1.2) is a bulk result: it does not constrain the behavior of individual eigenvalues. The
additive counterpart is Wigner's classical semicircle law. Let $X^N$ be a Gaussian unitary ensemble
(GUE$^N$), meaning that the joint density of entries of $X^N$ is proportional to $\exp\left(-\frac{N}{2}\mathrm{Tr}(X^2)\right)$.
Alternatively, $X^N$ may be described as a Gaussian Wigner matrix, meaning it is Hermitian, and otherwise has
i.i.d. centered Gaussian entries of variance $\frac{1}{N}$. Wigner's law states that the empirical spectral distribution
converges weakly almost surely to a limit: the semicircle distribution $\frac{1}{2\pi}\sqrt{(4 - x^2)_+}\,dx$, supported
on $[-2, 2]$ (cf. [50]). This holds for all Wigner matrices, independent of the distribution of the entries,
cf. [2]. But this does not imply that the spectrum of $X^N$ converges almost surely to $[-2,2]$; indeed, it is
known that this "hard edge" phenomenon occurs iff the fourth moments of the entries of $X^N$ are finite
(cf. [3]).
Our first major theorem is a "hard edge" result for the empirical law of eigenvalues of the Brownian
motion $U^N_t$. Since the spectrum is contained in the circle $\mathbb{U}_1$, instead of discussing the "largest"
eigenvalue, we characterize convergence in terms of Hausdorff distance $d_H$: the Hausdorff distance
between two compact subsets $A, B$ of a metric space is defined to be

$$d_H(A, B) = \inf\{\epsilon > 0 \colon A \subseteq B_\epsilon \ \&\ B \subseteq A_\epsilon\},$$

where $A_\epsilon$ is the set of points within distance $\epsilon$ of $A$. It is easy to check that the "hard edge" theorem for
Wigner ensembles is equivalent to the statement that $d_H(\mathrm{spec}(X^N), [-2, 2]) \to 0$ a.s. as $N \to \infty$; for a
related discussion, see Corollary 3.3 and Remark 3.4 below.
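For finite point sets, such as a computed spectrum compared against a discretized arc, the Hausdorff distance defined above reduces to a max-min computation. The following is a minimal sketch, assuming NumPy and points given as real or complex numbers; the function name `hausdorff_distance` is hypothetical.

```python
import numpy as np

def hausdorff_distance(A, B):
    """Hausdorff distance between two finite sets of points in the complex plane."""
    A = np.asarray(A, dtype=complex).ravel()
    B = np.asarray(B, dtype=complex).ravel()
    pairwise = np.abs(A[:, None] - B[None, :])   # |a - b| for every pair (a, b)
    a_to_B = pairwise.min(axis=1).max()          # smallest eps with A inside B_eps
    b_to_A = pairwise.min(axis=0).max()          # smallest eps with B inside A_eps
    return max(a_to_B, b_to_A)
```

Both one-sided quantities are needed: an outlier eigenvalue far from the limit support makes the first one large, while a gap in the spectrum makes the second one large.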

Theorem 1.1. Let $N \in \mathbb{N}$, and let $(U^N_t)_{t\ge 0}$ be a Brownian motion on $\mathbb{U}_N$. Fix $t > 0$. Denote by $\nu_t$ the law of
the free unitary Brownian motion, cf. Theorem 2.5. Then

$$d_H(\mathrm{spec}(U^N_t), \mathrm{supp}\,\nu_t) \to 0 \quad \text{a.s. as } N \to \infty.$$


Remark 1.2. When $t \ge 4$, $\mathrm{supp}\,\nu_t = \mathbb{U}_1$, and Theorem 1.1 is immediate; the content here is that, for
$0 < t < 4$, there are asymptotically no eigenvalues outside the arc in (2.12).



Figure 1: The spectrum of the unitary Brownian motion with $N = 400$ and $t = 1$. These figures
were produced from 1000 trials. On the left is a plot of the eigenvalues, while on the right is a 1000-bin
histogram of their complex arguments. The argument range of the data is $[-1.9392, 1.9291]$, as
compared to the predicted large-$N$ limit range (to four digits) $[-1.9132, 1.9132]$, cf. (2.12).


To prove Theorem 1.1, our method is to prove sufficiently tight estimates on the rate of convergence
of the moments of $U^N_t$. We record the main estimate here, since it is of independent interest.

Theorem 1.3. Let $N, n \in \mathbb{N}$, and fix $t > 0$. Then

$$\left| \mathbb{E}\,\mathrm{tr}\left[(U^N_t)^n\right] - \int_{\mathbb{U}_1} w^n \, \nu_t(dw) \right| \le \frac{t^2 n^4}{N^2}. \tag{1.3}$$

The tool we use to calculate these moments is the Schur-Weyl duality for the representation theory of
$\mathbb{U}_N$; see Section 2.2. Theorems 1.1 and 1.3 are proved in Section 3.
The second half of this paper is devoted to a multi-time, multi-matrix extension of this result.
Biane's main theorem in [5] states that the process $(U^N_t)_{t\ge 0}$ converges (in the sense of finite-dimensional
noncommutative distributions) to a free unitary Brownian motion $u = (u_t)_{t\ge 0}$. To be precise: for
any $k \in \mathbb{N}$ and times $t_1, \ldots, t_k \ge 0$, and any noncommutative polynomial $P \in \mathbb{C}\langle X_1, \ldots, X_{2k}\rangle$ in $2k$
indeterminates, the random trace moments of $(U^N_{t_j})_{1\le j\le k}$ converge almost surely to the corresponding
trace moments of $(u_{t_j})_{1\le j\le k}$:

$$\lim_{N\to\infty} \mathrm{tr}\left(P\left(U^N_{t_1}, (U^N_{t_1})^*, \ldots, U^N_{t_k}, (U^N_{t_k})^*\right)\right) = \tau\left(P\left(u_{t_1}, u_{t_1}^*, \ldots, u_{t_k}, u_{t_k}^*\right)\right) \quad \text{a.s.}$$

(Here $\tau$ is the tracial state on the noncommutative probability space where $(u_t)_{t\ge 0}$ lives; cf. Section 2.3.)
This is the noncommutative extension of a.s. weak convergence of the empirical spectral distribution.
The corresponding strengthening to the level of the "hard edge" is strong convergence: instead of measuring
moments with the linear functionals $\mathrm{tr}$ and $\tau$, we insist on a.s. convergence of polynomials in
operator norm. See Section 2.3 for a full definition and history.

Theorem 1.1 can be rephrased to say that, for any fixed $t > 0$, $(U^N_t, (U^N_t)^*)$ converges strongly to
$(u_t, u_t^*)$ (cf. Corollary 3.3). Our second main theorem is the extension of this to any finite collection of
times. In fact, we prove a more general extension theorem, as follows.


Theorem 1.4. For each $N$, let $(U^N_t)_{t\ge 0}$ be a Brownian motion on $\mathbb{U}_N$. Let $A^N_1, \ldots, A^N_n$ be random matrix
ensembles in $\mathbb{M}_N$ all independent from $(U^N_t)_{t\ge 0}$, and suppose that $(A^N_1, \ldots, A^N_n)$ converges strongly to
$(a_1, \ldots, a_n)$. Let $(u_t)_{t\ge 0}$ be a free unitary Brownian motion freely independent from $\{a_1, \ldots, a_n\}$. Then, for
any $k \in \mathbb{N}$, and any $t_1, \ldots, t_k \ge 0$,

$$(A^N_1, \ldots, A^N_n, U^N_{t_1}, \ldots, U^N_{t_k}) \text{ converges strongly to } (a_1, \ldots, a_n, u_{t_1}, \ldots, u_{t_k}).$$

Theorem 1.4 is proved in Section 4.

We conclude the paper with an application of these strong convergence results to the empirical
spectral distribution of the Jacobi process, in Theorem 5.7. We proceed now with Section 2, laying out
the basic concepts, preceding results, and notation we will use throughout.
2 Background


Here we set notation and briefly recall some main ideas and results we will need to prove our main
results. Section 2.1 introduces the Brownian motion $(U^N_t)_{t\ge 0}$ on $\mathbb{U}_N$. Section 2.2 discusses the Schur-Weyl
duality for the representation theory of $\mathbb{U}_N$, and its use in computing expectations of polynomials
in $U^N_t$. Section 2.3 discusses noncommutative distributions (which generalize empirical spectral
distributions to collections of noncommuting random matrix ensembles, and beyond) and associated
notions of convergence, including strong convergence. Finally, Section 2.4 reviews key ideas from free
probability and free stochastic calculus, leading up to the definition of free unitary Brownian motion
and its spectral measure $\nu_t$.


2.1 Brownian Motion on $\mathbb{U}_N$

Throughout, $\mathbb{U}_N$ denotes the unitary group of rank $N$; its Lie algebra $\mathrm{Lie}(\mathbb{U}_N) = \mathfrak{u}_N$ consists of the
skew-Hermitian matrices in $\mathbb{M}_N$, $\mathfrak{u}_N = \{X \in \mathbb{M}_N \colon X^* = -X\}$. We define a real inner product on $\mathfrak{u}_N$
by scaling the Hilbert-Schmidt inner product

$$\langle X, Y\rangle_N = -N\,\mathrm{Tr}(XY), \qquad X, Y \in \mathfrak{u}_N.$$

As explained in [20], this is the unique scaling that gives a meaningful limit as $N \to \infty$.

Any vector $X \in \mathfrak{u}_N$ gives rise to a unique left-invariant vector field on $\mathbb{U}_N$; we denote this vector
field as $\partial_X$ (it is more commonly called $X$ in the geometry literature). That is: $\partial_X$ is a left-invariant
derivation on $C^\infty(\mathbb{U}_N)$ whose action is

$$(\partial_X f)(U) = \left.\frac{d}{dt}\right|_{t=0} f(Ue^{tX}),$$

where $e^{tX}$ denotes the usual matrix exponential (which is the exponential map for the linear Lie group
$\mathbb{U}_N$; in particular $e^{tX} \in \mathbb{U}_N$ whenever $X \in \mathfrak{u}_N$). The Laplacian $\Delta_{\mathbb{U}_N}$ on $\mathbb{U}_N$ (determined by the metric
$\langle\cdot, \cdot\rangle_N$) is the second-order differential operator

$$\Delta_{\mathbb{U}_N} = \sum_{X \in \beta_N} \partial_X^2,$$

where $\beta_N$ is any orthonormal basis for $\mathfrak{u}_N$; the operator does not depend on which orthonormal basis
is used. The Laplacian is a negative definite elliptic operator; it is essentially self-adjoint in $L^2(\mathbb{U}_N)$
taken with respect to the Haar measure (cf. [41, 45]).

The unitary Brownian motion $U^N = (U^N_t)_{t\ge 0}$ is the Markov diffusion process on $\mathbb{U}_N$ with generator
$\frac12\Delta_{\mathbb{U}_N}$, with $U^N_0 = I_N$. In particular, this means that the law of $U^N_t$ at any fixed time $t > 0$ is the
heat kernel measure on $\mathbb{U}_N$. This is essentially by definition: the heat kernel measure $\rho^N_t$ is defined
weakly by

$$\mathbb{E}_{\rho^N_t}(f) = \int_{\mathbb{U}_N} f \, d\rho^N_t = \left(e^{\frac{t}{2}\Delta_{\mathbb{U}_N}} f\right)(I_N), \qquad f \in C(\mathbb{U}_N). \tag{2.1}$$

There are many tools for computing the heat kernel using the representation theory of $\mathbb{U}_N$; we discuss
this further in Section 2.2. We mention here the fact that the heat kernel is symmetric: it is invariant
under $U \mapsto U^{-1}$ (this is true on any Lie group).

There are (at least) two more constructive ways to understand the Brownian motion $U^N$ directly.
The first is as a Lévy process: $U^N$ is uniquely defined by the properties

• CONTINUITY: The paths $t \mapsto U^N_t$ are a.s. continuous.

• INDEPENDENT MULTIPLICATIVE INCREMENTS: For $0 \le s \le t$, the multiplicative increment
$(U^N_s)^{-1}U^N_t$ is independent from the filtration up to time $s$ (i.e. from all random variables
measurable with respect to the entries of $U^N_r$ for $0 \le r \le s$).

• STATIONARY HEAT-KERNEL DISTRIBUTED INCREMENTS: For $0 \le s \le t$, the multiplicative increment
$(U^N_s)^{-1}U^N_t$ has the distribution $\rho^N_{t-s}$.

In particular, since $U^N_t$ is distributed according to $\rho^N_t$, we typically write expectations of functions on
$\mathbb{U}_N$ with respect to $\rho^N_t$ as

$$\mathbb{E}_{\rho^N_t}(f) = \mathbb{E}[f(U^N_t)].$$

For the purpose of computations, the best representation of $U^N$ is as the solution to a stochastic
differential equation. Let $X^N$ be a GUE$^N$-valued Brownian motion: that is, $X^N$ is Hermitian where
the random variables $[X^N]_{jj}$, $\mathrm{Re}[X^N]_{jk}$, $\mathrm{Im}[X^N]_{jk}$ for $1 \le j < k \le N$ are all independent Brownian
motions (of variance $t/N$ on the main diagonal and $t/2N$ above it). Then $U^N$ is the solution of the Itô
stochastic differential equation

$$dU^N_t = iU^N_t \, dX^N_t - \frac12 U^N_t \, dt, \qquad U^N_0 = I_N. \tag{2.2}$$
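One common way to simulate (2.2) numerically is to multiply by exponentials of small GUE increments, which keeps every iterate exactly unitary. The following is a minimal sketch, assuming NumPy; it is a standard discretization chosen here for illustration, not a method taken from the paper, and the names `gue_increment`, `unitary_exp` and `simulate_unitary_bm` are hypothetical.

```python
import numpy as np

def gue_increment(N, dt, rng):
    """Hermitian increment: variance dt/N on the diagonal, dt/(2N) for Re and Im above it."""
    A = rng.standard_normal((N, N))
    B = rng.standard_normal((N, N))
    G = A + 1j * B
    H = (G + G.conj().T) / 2          # Hermitian; diagonal variance 1, off-diagonal Re/Im variance 1/2
    return np.sqrt(dt / N) * H

def unitary_exp(H):
    """exp(iH) for Hermitian H, computed through the spectral decomposition."""
    w, V = np.linalg.eigh(H)
    return (V * np.exp(1j * w)) @ V.conj().T

def simulate_unitary_bm(N, t, steps, seed=0):
    """Approximate sample of U_t^N as a product of exp(i * dX) over `steps` increments."""
    rng = np.random.default_rng(seed)
    dt = t / steps
    U = np.eye(N, dtype=complex)
    for _ in range(steps):
        U = U @ unitary_exp(gue_increment(N, dt, rng))
    return U
```

Expanding $e^{i\,dX} = 1 + i\,dX - \frac12 (dX)^2 + \cdots$ and using $(dX^N_t)^2 = dt \cdot I_N$ shows why this scheme is consistent with the drift term $-\frac12 U^N_t\,dt$ in (2.2).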

2.2 Heat Kernel Computations and the Schur-Weyl duality

There is no closed formula for the density of the heat kernel measure $\rho^N_t$ with $t > 0$, even in the case
$N = 1$. For the purposes of computing limits (in distribution, cf. Section 2.3), it is possible to compute
many limiting moments due to a general feature of the Laplacian $\Delta_{\mathbb{U}_N}$: with the chosen scaling, it
can be decomposed appropriately into a form $D + \frac{1}{N^2}L$ for $N$-independent operators $D$ and $L$ acting
on an auxiliary space. Given a decomposition of the auxiliary space into invariant finite-dimensional
subspaces, the boundedness of the exponential map then gives

$$e^{\frac{t}{2}\Delta_{\mathbb{U}_N}} = e^{\frac{t}{2}D} + O\left(\frac{1}{N^2}\right),$$

which reduces the computation of limiting moments to the (combinatorially tractable) computation
of the flow of $D$. In [20, 30, 31] and HQ, building on earlier work of [34, 40, 43] and others, the auxiliary
space used was composed of so-called trace polynomials: polynomials in $U, \mathrm{tr}(U), \mathrm{tr}(U^2), \ldots$ In
the complementary work of the second author [14] and preceding papers of Lévy [33], an alternative
approach relying on the representation theory of $\mathbb{U}_N$ was used. We outline this approach presently.

Fix a positive integer $n \in \mathbb{N}$. Let $S_n$ denote the permutation group on $n$ letters. For $1 \le i \ne j \le n$, let
$(i\ j) \in S_n$ denote the transposition. For $\sigma \in S_n$ let $\#\sigma$ denote the number of cycles in the permutation
$\sigma$. Define two linear operators $L_n$ and $D_n$ on the (finite-dimensional) group algebra $\mathbb{C}[S_n]$ as follows:
they are the linear extensions of

$$L_n(\sigma) = \sum_{\substack{1 \le i < j \le n \\ \#(\sigma\cdot(i\ j)) > \#\sigma}} \sigma \cdot (i\ j) \tag{2.3}$$

$$D_n(\sigma) = \sum_{\substack{1 \le i < j \le n \\ \#(\sigma\cdot(i\ j)) < \#\sigma}} \sigma \cdot (i\ j) \tag{2.4}$$

When the context is clear, we may drop the index $n$ (with the knowledge that $D = D_n$ and $L = L_n$ act
only on the finite-dimensional space in which their argument lives).
Let $V$ be any vector space of dimension $\ge 2$. The standard action of $S_n$ on $V^{\otimes n}$ is given by

$$\sigma \cdot (v_1 \otimes \cdots \otimes v_n) = v_{\sigma^{-1}(1)} \otimes \cdots \otimes v_{\sigma^{-1}(n)}. \tag{2.5}$$

This yields a faithful representation, and so we will typically identify $\sigma$ with the corresponding
endomorphism of $V^{\otimes n}$. We will be concerned with this action when $V = \mathbb{C}^N$, so that (fixing the standard
basis of $\mathbb{C}^N$) $\mathrm{End}(\mathbb{C}^N) = \mathbb{M}_N$. We may then readily verify that, if $\sigma \in S_n$ has cycle decomposition
$\sigma = c_1 \cdots c_r$ with $c_k = (i^k_1 \cdots i^k_{\ell_k})$, then for $A_1, \ldots, A_n \in \mathbb{M}_N$,

$$\mathrm{Tr}_{(\mathbb{C}^N)^{\otimes n}}(\sigma \cdot A_1 \otimes \cdots \otimes A_n) = \mathrm{Tr}_{\mathbb{C}^N}\left(A_{i^1_1} \cdots A_{i^1_{\ell_1}}\right) \cdots \mathrm{Tr}_{\mathbb{C}^N}\left(A_{i^r_1} \cdots A_{i^r_{\ell_r}}\right) \tag{2.6}$$

where we have emphasized over which space each trace is taken. Note, in particular, that if $\sigma =
(i_1 \cdots i_n)$ is a cycle, then $\mathrm{Tr}(A_1 \otimes \cdots \otimes A_n \cdot \sigma) = \mathrm{Tr}(A_{i_1} \cdots A_{i_n})$ is a single trace.

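We define a function $F^N_n \colon \mathbb{C}[S_n] \times \mathbb{M}_N \to \mathbb{C}$, linear in the first argument, as follows:

$$F^N_n(\sigma, M) = \frac{1}{N^{\#\sigma}}\,\mathrm{Tr}_{(\mathbb{C}^N)^{\otimes n}}\left(\sigma \cdot M^{\otimes n}\right), \qquad \sigma \in S_n. \tag{2.7}$$

The $N^{\#\sigma}$ in the denominator is a normalizing factor, since $\mathrm{Tr}(\sigma) = N^{\#\sigma}$ by (2.6); that is to say,
$F^N_n(\sigma, M) = \mathrm{Tr}(\sigma \cdot M^{\otimes n}) / \mathrm{Tr}(\sigma)$, and so $F^N_n(\sigma, I_N) = 1$ for any $\sigma \in S_n$. As usual, when the context is
clear, we may drop the indices $N, n$ and refer to the function as $F(\sigma, M)$. Note that any homogeneous
degree $n$ polynomial in the entries of $M$ can be represented in the form $F(\sigma, M)$ for some element
$\sigma \in \mathbb{C}[S_n]$. This highlights the power of the following theorem.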
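The normalization $\mathrm{Tr}(\sigma) = N^{\#\sigma}$ can be checked by brute force for small $n$ and $N$. The following is a minimal sketch using only the Python standard library; permutations are encoded as 0-based tuples, which is an assumption of this illustration, and the function names are hypothetical.

```python
from itertools import product

def count_cycles(sigma):
    """Number of cycles of a permutation; sigma[m] is the image of m (0-based)."""
    seen, cycles = set(), 0
    for start in range(len(sigma)):
        if start not in seen:
            cycles += 1
            m = start
            while m not in seen:
                seen.add(m)
                m = sigma[m]
    return cycles

def trace_of_permutation(sigma, N):
    """Trace of sigma acting on the n-fold tensor power of C^N by permuting factors.

    The operator is a permutation matrix in the standard basis, so its trace
    counts the basis tensors e_{j_1} x ... x e_{j_n} that it fixes, i.e. the
    index tuples j that are constant along every cycle of sigma.
    """
    n = len(sigma)
    return sum(
        all(j[sigma[m]] == j[m] for m in range(n))
        for j in product(range(N), repeat=n)
    )
```

An index tuple is fixed exactly when it is constant on each cycle, which leaves $N$ free choices per cycle and hence $N^{\#\sigma}$ fixed basis tensors.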

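Theorem 2.1 (Lévy, Schur-Weyl). Fix a positive integer $N$, a matrix $M \in \mathbb{M}_N$, and a time $t \ge 0$. Then for
any $n \in \mathbb{N}$ and $\sigma \in \mathbb{C}[S_n]$,

$$\mathbb{E}\left[F^N_n(\sigma, U^N_t M)\right] = e^{-\frac{nt}{2}} F^N_n\left(e^{-t\left(L_n + \frac{1}{N^2} D_n\right)}\sigma, M\right).$$

(One might expect to need a $\frac12$ in the exponential; in fact, this factor of $\frac12$ has been built into the
operators $L_n$ and $D_n$.)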
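As a worked instance of Theorem 2.1, take $n = 2$ and $M = I_N$, so that $F((1\ 2), U) = \mathrm{tr}(U^2)$. The following is a minimal sketch, assuming NumPy, and assuming the reading of (2.3) and (2.4) in which $L_n$ collects the transpositions that increase the number of cycles and $D_n$ those that decrease it, with the prefactor $e^{-nt/2}$ and the weight $1/N^2$ on $D_n$. Under those assumptions the hand computation gives $\mathbb{E}\,\mathrm{tr}[(U^N_t)^2] = e^{-t}(\cosh(t/N) - N\sinh(t/N))$, which the code compares with a direct matrix exponential on $\mathbb{C}[S_2]$. All function names are hypothetical.

```python
import numpy as np

def expm_series(A, terms=40):
    """Matrix exponential by truncated Taylor series (adequate for small matrices of small norm)."""
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k
        result = result + term
    return result

def second_moment_via_schur_weyl(N, t):
    # Basis of C[S_2]: index 0 = identity e, index 1 = transposition (1 2).
    # e.(1 2) = (1 2) has fewer cycles than e, so it contributes to D_2(e);
    # (1 2).(1 2) = e has more cycles than (1 2), so it contributes to L_2((1 2)).
    L = np.array([[0.0, 1.0], [0.0, 0.0]])   # column 1: L_2((1 2)) = e
    D = np.array([[0.0, 0.0], [1.0, 0.0]])   # column 0: D_2(e) = (1 2)
    # Coefficients of exp(-t (L + D / N^2)) applied to the basis element (1 2).
    propagated = expm_series(-t * (L + D / N**2))[:, 1]
    # F(sigma, I_N) = 1 for both basis elements, so F is the sum of the coefficients.
    return np.exp(-t) * propagated.sum()

def second_moment_closed_form(N, t):
    return np.exp(-t) * (np.cosh(t / N) - N * np.sinh(t / N))
```

Letting $N \to \infty$ in the closed form gives $e^{-t}(1 - t)$, and the difference from this limit is of order $1/N^2$, in line with the rate in Theorem 1.3.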


2.3 Noncommutative distributions and convergence

Let $(\mathcal{A}, \tau)$ be a $W^*$-probability space: a von Neumann algebra $\mathcal{A}$ equipped with a faithful, normal,
tracial state $\tau$. Elements $a \in \mathcal{A}$ are referred to as (noncommutative) random variables. The
noncommutative distribution of any finite collection $a_1, \ldots, a_k \in \mathcal{A}$ is the linear functional $\mu_{(a_1,\ldots,a_k)}$ on
noncommutative polynomials defined by

$$\mu_{(a_1,\ldots,a_k)} \colon \mathbb{C}\langle X_1, \ldots, X_k\rangle \to \mathbb{C}, \qquad P \mapsto \tau(P(a_1, \ldots, a_k)). \tag{2.8}$$

Some authors explicitly include moments in $a_j, a_j^*$ in the definition of the distribution; we will instead
refer to the $*$-distribution as the noncommutative distribution $\mu_{(a_1,a_1^*,\ldots,a_k,a_k^*)}$ explicitly when needed.
Note, when $a \in \mathcal{A}$ is normal, $\mu_{a,a^*}$ is determined by a unique probability measure $\mathrm{Law}_a$, the spectral
measure of $a$, on $\mathbb{C}$ in the usual way:

$$\int_{\mathbb{C}} f(z, \bar z) \, \mathrm{Law}_a(dz\,d\bar z) = \mu_{a,a^*}(f), \qquad f \in \mathbb{C}[X, X^*]$$

(i.e. when $a$ is normal it suffices to restrict the noncommutative distribution to ordinary commuting
polynomials). In this case, the support $\mathrm{supp}\,\mathrm{Law}_a$ is equal to the spectrum $\mathrm{spec}(a)$. If $u \in \mathcal{A}$ is unitary,
$\mathrm{Law}_u$ is supported in the unit circle $\mathbb{U}_1$. For example: a Haar unitary is a unitary operator in $(\mathcal{A}, \tau)$
whose spectral measure is the uniform probability measure on $\mathbb{U}_1$ (equivalently $\tau(u^n) = \delta_{n0}$ for $n \in \mathbb{Z}$).
In general, however, for a collection of elements $a_1, \ldots, a_k$ (normal or not) that do not commute, the
noncommutative distribution is not determined by any measure on $\mathbb{C}$.

As a prominent example, let $A^N$ be a normal random matrix ensemble in $\mathbb{M}_N$: i.e. $A^N$ is a random
variable defined on some probability space $(\Omega, \mathscr{F}, \mathbb{P})$, taking values in $\mathbb{M}_N$. The distribution of $A^N$
as a random variable is a measure on $\mathbb{M}_N$; but for each instance $\omega \in \Omega$, the matrix $A^N(\omega)$ is a
noncommutative random variable in the $W^*$-probability space $\mathbb{M}_N$, whose unique tracial state is $\mathrm{tr}$. In
this interpretation, the law $\mathrm{Law}_{A^N(\omega)}$ determined by its noncommutative distribution is precisely the
empirical spectral distribution

$$\mathrm{Law}_{A^N(\omega)} = \frac{1}{N}\sum_{j=1}^{N} \delta_{\lambda_j(A^N(\omega))},$$

where $\lambda_1(A^N(\omega)), \ldots, \lambda_N(A^N(\omega))$ are the (random) eigenvalues of $A^N$.

Let $(A^N_1, \ldots, A^N_n)$ be a collection of random matrix ensembles, viewed as (random) noncommutative
random variables in $(\mathbb{M}_N, \mathrm{tr})$. We will assume that the entries of $A^N_j$ are in $L^{\infty-}(\Omega, \mathscr{F}, \mathbb{P})$,
meaning that they have finite moments of all orders. The noncommutative distribution $\mu_{(A^N_1,\ldots,A^N_n)}$
is thus a random linear functional $\mathbb{C}\langle X_1, \ldots, X_n\rangle \to \mathbb{C}$; its value on a polynomial $P$ is the (classical)
random variable $\mathrm{tr}(P(A^N_1, \ldots, A^N_n))$, cf. (2.8). Now, let $(\mathcal{A}, \tau)$ be a $W^*$-probability space, and let
$a_1, \ldots, a_n \in \mathcal{A}$. Say that $(A^N_1, \ldots, A^N_n)$ converges in noncommutative distribution to $a_1, \ldots, a_n$
almost surely if $\mu_{(A^N_1,\ldots,A^N_n)} \to \mu_{(a_1,\ldots,a_n)}$ almost surely in the topology of pointwise convergence. That
is to say: convergence in noncommutative distribution means that all (random) mixed $\mathrm{tr}$ moments of
the ensembles $A^N_j$ converge a.s. to the same mixed $\tau$ moments of the $a_j$. Later, a stronger notion of
convergence emerged.


Definition 2.2. Let $A^N = (A^N_1, \ldots, A^N_n)$ be random matrix ensembles in $(\mathbb{M}_N, \mathrm{tr})$, and let $a = (a_1, \ldots, a_n)$
be random variables in a $W^*$-probability space $(\mathcal{A}, \tau)$. Say that $A^N$ converges strongly to $a$ if $A^N$ converges
to $a$ almost surely in noncommutative distribution, and additionally

$$\|P(A^N_1, \ldots, A^N_n)\|_{\mathbb{M}_N} \to \|P(a_1, \ldots, a_n)\|_{\mathcal{A}} \quad \text{a.s.} \qquad \forall P \in \mathbb{C}\langle X_1, \ldots, X_n\rangle.$$


This notion first appeared in the seminal paper [24] of Haagerup and Thorbjørnsen, where they
showed that if $X^N_1, \ldots, X^N_n$ are independent GUE$^N$ random matrices, then they converge strongly to
free semicircular random variables $(x_1, \ldots, x_n)$. The notion was formalized into Definition 2.2 in the
dissertation of Camille Male (cf. [35]), where the following generalization (an extension property of
strong convergence) was proved.

Theorem 2.3 (Male, 2012). Let $A^N = (A^N_1, \ldots, A^N_n)$ be a collection of random matrix ensembles that
converges strongly to some $a = (a_1, \ldots, a_n)$ in a $W^*$-probability space $(\mathcal{A}, \tau)$. Let $X^N = (X^N_1, \ldots, X^N_k)$ be
independent Gaussian unitary ensembles independent from $A^N$, and let $x = (x_1, \ldots, x_k)$ be freely independent
semicircular random variables in $\mathcal{A}$ all free from $a$. Then $(A^N, X^N)$ converges strongly to $(a, x)$.

(For a brief definition and discussion of free independence, see Section 2.4 below.) Later, together with
the present first author in [13], Male proved a strong convergence result for Haar distributed random
unitary matrices (which can be realized as $\lim_{t\to\infty} U^N_t$).

Theorem 2.4 (Collins, Male, 2013). Let $A^N = (A^N_1, \ldots, A^N_n)$ be a collection of random matrix ensembles that
converges strongly to some $a = (a_1, \ldots, a_n)$ in a $W^*$-probability space $(\mathcal{A}, \tau)$. Let $U^N$ be a Haar-distributed
random unitary matrix independent from $A^N$, and let $u$ be a Haar unitary operator in $\mathcal{A}$ freely independent
from $a$. Then $(A^N, U^N, (U^N)^*)$ converges strongly to $(a, u, u^*)$.
(The convergence in distribution in Theorem 2.4 is originally due to Voiculescu [47]; a simpler proof of
this result was given in [10].) Our main Theorem 1.4 can be viewed as a combination and generalization
of Theorems 2.3 and 2.4.

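Note that, for any matrix $A \in \mathbb{M}_N$ and any operator $a \in \mathcal{A}$,

$$\|A\|_{\mathbb{M}_N} = \lim_{p\to\infty}\left(\mathrm{tr}\left[(AA^*)^{p/2}\right]\right)^{1/p} \quad \text{and} \quad \|a\|_{\mathcal{A}} = \lim_{p\to\infty}\left(\tau\left[(aa^*)^{p/2}\right]\right)^{1/p}.$$

These hold because the states $\mathrm{tr}$ and $\tau$ are faithful. The quantities inside the limits are the noncommutative
$L^p$-norms on $L^p(\mathbb{M}_N, \mathrm{tr})$ and $L^p(\mathcal{A}, \tau)$ respectively. The norm-convergence statement of strong convergence can
thus be rephrased as an almost sure interchange of limits: if $A^N$ converges a.s. to $a$ in noncommutative
distribution, then $A^N$ converges to $a$ strongly if and only if

$$\mathbb{P}\left(\lim_{N\to\infty}\lim_{p\to\infty} \|P(A^N)\|_{L^p(\mathbb{M}_N,\mathrm{tr})} = \lim_{p\to\infty} \|P(a)\|_{L^p(\mathcal{A},\tau)}\right) = 1, \qquad \forall P \in \mathbb{C}\langle X_1, \ldots, X_n\rangle. \tag{2.9}$$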
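The convergence of the normalized $L^p$-norms to the operator norm can be observed numerically for a single matrix. The following is a minimal sketch, assuming NumPy; it uses the fact that $\mathrm{tr}[(AA^*)^{p/2}]$ is the average of the $p$-th powers of the singular values of $A$. The function name and the test matrix are hypothetical.

```python
import numpy as np

def normalized_lp_norm(A, p):
    """(tr[(A A^*)^{p/2}])^{1/p} with tr the normalized trace, via singular values."""
    s = np.linalg.svd(A, compute_uv=False)
    s_max = s.max()
    if s_max == 0:
        return 0.0
    # Factor out s_max so that large p does not overflow.
    return s_max * np.mean((s / s_max) ** p) ** (1.0 / p)

A = np.array([[2.0, 1.0], [0.0, 1.0]])   # hypothetical test matrix
operator_norm = np.linalg.norm(A, 2)     # largest singular value
lp_norms = [normalized_lp_norm(A, p) for p in (2, 4, 16, 64, 256, 1024)]
```

With the normalized trace these norms increase with $p$ and stay below the operator norm; the gap at level $p$ is at most a factor $N^{-1/p}$, which is why the order of the limits in $N$ and $p$ in (2.9) matters.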

If we fix $p$ instead of sending $p \to \infty$, the corresponding notion of "$L^p$-strong convergence" of the
unitary Brownian motion $(U^N_t)_{t\ge 0}$ to the free unitary Brownian motion $(u_t)_{t\ge 0}$ was proved in the third
author's paper [30]. This weaker notion of strong convergence does not have the same important
applications as strong convergence, however. As a demonstration of the power of true strong convergence,
we give an application to the eigenvalues of the Jacobi process in Section 5: the principal angles
between subspaces randomly rotated by $(U^N_t)_{t\ge 0}$ evolve a.s. with finite speed for all large $N$.


2.4 Free probability, free stochastics, and free unitary Brownian motion

We briefly recall basic definitions and constructions here, mostly for the sake of fixing notation. The
uninitiated reader is referred to the monographs [49, 38], and the introductions of the authors' previous
papers DT2J ; 3^1 for more details.

Let $(\mathcal{A}, \tau)$ be a $W^*$-probability space. Unital subalgebras $\mathcal{A}_1, \ldots, \mathcal{A}_m \subset \mathcal{A}$ are called free or freely
independent if the following property holds: given any sequence of indices $k_1, \ldots, k_n \in \{1, \ldots, m\}$
that is consecutively-distinct (meaning $k_{j-1} \ne k_j$ for $1 < j \le n$) and random variables $a_j \in \mathcal{A}_{k_j}$, if
$\tau(a_j) = 0$ for $1 \le j \le n$ then $\tau(a_1 \cdots a_n) = 0$. We say random variables $a_1, \ldots, a_m$ are freely independent
if the unital $*$-subalgebras $\mathcal{A}_j = \langle a_j, a_j^*\rangle \subset \mathcal{A}$ they generate are freely independent. Freeness is
a moment factorization property: by centering random variables $a \to a - \tau(a)1_{\mathcal{A}}$, freeness allows the
(recursive) computation of any joint moment in free variables as a polynomial in the moments of the
separate random variables. In other words: the distribution $\mu_{(a_1,\ldots,a_k)}$ of a collection of free random
variables is determined by the distributions $\mu_{a_1}, \ldots, \mu_{a_k}$ separately.

A noncommutative stochastic process is simply a one-parameter family $a = (a_t)_{t\ge 0}$ of random
variables in some $W^*$-probability space $(\mathcal{A}, \tau)$. It defines a filtration: an increasing (by inclusion)
collection $\mathcal{A}_t$ of subalgebras of $\mathcal{A}$ defined by $\mathcal{A}_t = W^*(a_s \colon 0 \le s \le t)$, the von Neumann algebras
generated by all the random variables $a_s$ for $s \le t$. Given such a filtration $(\mathcal{A}_t)_{t\ge 0}$, we call a process
$b = (b_t)_{t\ge 0}$ adapted if $b_t \in \mathcal{A}_t$ for all $t \ge 0$.

A free additive Brownian motion is a selfadjoint noncommutative stochastic process $x = (x_t)_{t\ge 0}$
in a $W^*$-probability space $(\mathcal{A}, \tau)$ with the following properties:

• CONTINUITY: The map $\mathbb{R}_+ \to \mathcal{A} \colon t \mapsto x_t$ is weak$^*$-continuous.

• FREE INCREMENTS: For $0 \le s \le t$, the additive increment $x_t - x_s$ is freely independent from $\mathcal{A}_s$
(the filtration generated by $x$ up to time $s$).

• STATIONARY INCREMENTS: For $0 \le s \le t$, $\mu_{x_t - x_s} = \mu_{x_{t-s}}$.
It follows from the free central limit theorem that the increments must have the semicircular
distribution: $\mathrm{Law}_{x_t} = \frac{1}{2\pi t}\sqrt{(4t - x^2)_+}\,dx$. Voiculescu (cf. [46, 47, 49]) showed that free additive Brownian
motions exist: they can be constructed in any $W^*$-probability space rich enough to contain an infinite
sequence of freely independent semicircular random variables (where $x$ can be constructed in the usual
way as an isonormal process).
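The semicircular law of variance $t$ can be sanity-checked by numerical integration. The following is a minimal sketch, assuming NumPy and assuming the normalization $\frac{1}{2\pi t}$ for the density on $[-2\sqrt{t}, 2\sqrt{t}]$, which is the constant that makes it a probability density; the function names are hypothetical.

```python
import numpy as np

def semicircle_density(x, t):
    """Density of the semicircular law of variance t (assumed normalization 1/(2*pi*t))."""
    return np.sqrt(np.maximum(4.0 * t - x**2, 0.0)) / (2.0 * np.pi * t)

def semicircle_moment(k, t, points=200_000):
    """k-th moment of the semicircular law of variance t, by the midpoint rule."""
    edge = 2.0 * np.sqrt(t)
    h = 2.0 * edge / points
    x = -edge + h * (np.arange(points) + 0.5)   # midpoints of a uniform grid on [-edge, edge]
    return np.sum(x**k * semicircle_density(x, t)) * h
```

The zeroth moment should be 1 and the second moment should be $t$, matching the variance of the increment $x_t$; setting $t = 1$ gives the law on $[-2, 2]$ from Wigner's theorem.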

In [6, 7] (and many subsequent works such as [32]), a theory of stochastic analysis built on $x$ was
developed. Free stochastic integrals with respect to $x$ are defined precisely as in the classical setting: as
$L^2(\mathcal{A}, \tau)$-limits of integrals of simple processes, where for constant $a \in \mathcal{A}_{t_-}$, $\int 1_{[t_-, t_+]}(s)\, a \, dx_s$ is defined
to be $a \cdot (x_{t_+} - x_{t_-})$. Using the standard Picard iteration techniques, it is known that free stochastic
integral equations of the form

$$a_t = a_0 + \int_0^t \phi(s, a_s)\, ds + \int_0^t \sigma(s, a_s)\, dx_s \tag{2.10}$$

have unique adapted solutions for drift $\phi$ and diffusion $\sigma$ coefficient functions that are globally
Lipschitz. (Note: due to the noncommutativity, we should really be integrating a biprocess $\beta_t \# dx_t$ in the
Itô term, where $\beta_t \in \mathcal{A} \otimes \mathcal{A}$ so that it may act on both sides of $x$. A one-sided process like the one
in (2.10) is typically not self-adjoint, which limits $\phi, \sigma$ to be polynomials, and ergo linear polynomials
due to the Lipschitz constraint. That will suffice for our present purposes.) Equations like (2.10) are
often written in "stochastic differential" form as

$$da_t = \phi(t, a_t)\, dt + \sigma(t, a_t)\, dx_t.$$

Given a free additive Brownian motion $x$, the associated free unitary Brownian motion $u = (u_t)_{t\ge 0}$ is
the solution to the free SDE

$$du_t = iu_t \, dx_t - \frac12 u_t \, dt, \qquad u_0 = 1. \tag{2.11}$$

This precisely mirrors the (classical) Itô SDE (2.2) that determined the Brownian motion $(U^N_t)_{t\ge 0}$ on
$\mathbb{U}_N$.


The free unitary Brownian motion $(u_t)_{t\ge 0}$ was introduced in [5] (via the definition above). In that
paper, with more details in the subsequent 0|, together with independent statements of the same type
in BOH , the law $\mathrm{Law}_{u_t}$ was computed. Since $u_t$ is unitary, this distribution is determined by a measure
$\nu_t$ that is supported on the unit circle $\mathbb{U}_1$. This measure is described as follows.


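Theorem 2.5 (Biane 1997). For $t > 0$, $\nu_t$ has a continuous density $\varrho_t$ with respect to the normalized Haar
measure on $\mathbb{U}_1$. For $0 < t < 4$, its support is the connected arc

$$\mathrm{supp}\,\nu_t = \left\{ e^{i\theta} \colon |\theta| \le \frac12\sqrt{t(4-t)} + \arccos\left(1 - \frac{t}{2}\right) \right\}, \tag{2.12}$$

while $\mathrm{supp}\,\nu_t = \mathbb{U}_1$ for $t \ge 4$. The density $\varrho_t$ is real analytic on the interior of the arc. It is symmetric about $1$,
and is determined by $\varrho_t(e^{i\theta}) = \mathrm{Re}\,\kappa_t(e^{i\theta})$ where $z = \kappa_t(e^{i\theta})$ is the unique solution (with positive real part) to

$$\frac{z - 1}{z + 1}\, e^{\frac{t}{2}z} = e^{i\theta}.$$

Note that the arc (2.12) is the spectrum $\mathrm{spec}(u_t)$ for $0 < t < 4$; for $t \ge 4$, $\mathrm{spec}(u_t) = \mathbb{U}_1$.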
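The edge of the arc in (2.12) is easy to evaluate. The following is a minimal sketch, assuming NumPy and assuming the half-angle of the arc is $\frac12\sqrt{t(4-t)} + \arccos(1 - t/2)$; this reading can be cross-checked against the value $1.9132$ quoted for $t = 1$ in the caption of Figure 1. The function name is hypothetical.

```python
import numpy as np

def support_half_angle(t):
    """Half-angle of the arc supporting nu_t: the whole circle once t >= 4."""
    if t >= 4:
        return np.pi
    return 0.5 * np.sqrt(t * (4.0 - t)) + np.arccos(1.0 - t / 2.0)
```

The half-angle grows from $0$ at $t = 0$ to $\pi$ at $t = 4$, at which point the two ends of the arc meet at $-1$ and the support becomes the full circle.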

With this description, one can also give a characterization of the free unitary Brownian motion
similar to the invariant characterization of the Brownian motion $(U^N_t)_{t\ge 0}$ given in Section 2.1. That is, $(u_t)_{t\ge 0}$ is
the unique unitary-valued process that satisfies:

• CONTINUITY: The map $\mathbb{R}_+ \to \mathcal{A} \colon t \mapsto u_t$ is weak$^*$ continuous.

• FREELY INDEPENDENT MULTIPLICATIVE INCREMENTS: For $0 \le s \le t$, the multiplicative increment
$u_s^{-1}u_t$ is freely independent from the filtration up to time $s$ (i.e. from the von Neumann algebra
$\mathcal{A}_s$ generated by $\{u_r \colon 0 \le r \le s\}$).

• STATIONARY INCREMENTS WITH DISTRIBUTION $\nu$: For $0 \le s \le t$, the multiplicative increment
$u_s^{-1}u_t$ has distribution given by the law $\nu_{t-s}$.
3 The Hard Edge of the Spectrum

This section is devoted to the proof of our "hard edge" theorem for the spectrum of a single time
marginal $U^N_t$. We begin by showing how Theorem 1.1 follows from Theorem 1.3 and recast the
conclusion as a strong convergence statement in Corollary 3.3. Section 3.2 is then devoted to the proof of
the moment growth bound of Theorem 1.3.

3.1 Strong convergence and the proof of Theorem 1.1

We begin by briefly recalling some basic Fourier analysis on the circle $\mathbb{U}_1$. For $f \in L^2(\mathbb{U}_1)$, its Fourier
expansion is

$$f(w) = \sum_{n\in\mathbb{Z}} \hat f(n) w^n, \quad \text{where } \hat f(n) = \int_{\mathbb{U}_1} f(w) w^{-n} \, dw,$$

where $dw$ is the normalized Lebesgue measure on $\mathbb{U}_1$. For $p > 0$, the Sobolev space $H_p(\mathbb{U}_1)$ is defined
to be

$$H_p(\mathbb{U}_1) = \left\{ f \in L^2(\mathbb{U}_1) \colon \|f\|^2_{H_p} = \sum_{n\in\mathbb{Z}} (1 + n^2)^p |\hat f(n)|^2 < \infty \right\}. \tag{3.1}$$

If $\ell > k \ge 1$ are integers, and $\ell > p > k + \frac12$, then $C^\ell(\mathbb{U}_1) \subset H_p(\mathbb{U}_1) \subset C^k(\mathbb{U}_1)$; it follows that
$H_\infty(\mathbb{U}_1) = \bigcap_{p\ge 0} H_p(\mathbb{U}_1) = C^\infty(\mathbb{U}_1)$. These are standard Sobolev imbedding theorems (that hold for
smooth manifolds); for reference, see [22, Chapter 5.6] and [42, Chapter 3.2].


Theorem 1.3 yields the following estimate on moment growth tested against Sobolev functions
whose support is disjoint from the limit support.

Proposition 3.1. Fix $0 < t < 4$. Let $f \in H_5(\mathbb{U}_1)$ have support disjoint from $\mathrm{supp}\,\nu_t$. There is a constant
$C(f) > 0$ such that, for all $N \in \mathbb{N}$,

$$\left|\mathbb{E}\,\mathrm{tr}\left[f(U^N_t)\right]\right| \le \frac{t^2 C(f)}{N^2}. \tag{3.2}$$

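Proof. Denote by $\nu^N_t(n) = \mathbb{E}\,\mathrm{tr}[(U^N_t)^n]$ and $\nu_t(n) = \int_{\mathbb{U}_1} w^n \, \nu_t(dw) = \lim_{N\to\infty} \nu^N_t(n)$. Expanding $f$ as a
Fourier series, we have

$$\mathbb{E}\,\mathrm{tr}\left[f(U^N_t)\right] = \sum_{n\in\mathbb{Z}} \hat f(n)\, \mathbb{E}\,\mathrm{tr}\left[(U^N_t)^n\right] = \sum_{n\in\mathbb{Z}} \hat f(n)\, \nu^N_t(n). \tag{3.3}$$

By the assumption that $\mathrm{supp}\,f$ is disjoint from $\mathrm{supp}\,\nu_t$, we have

$$0 = \int_{\mathbb{U}_1} f \, d\nu_t = \sum_{n\in\mathbb{Z}} \hat f(n) \int_{\mathbb{U}_1} w^n \, \nu_t(dw) = \sum_{n\in\mathbb{Z}} \hat f(n)\, \nu_t(n). \tag{3.4}$$

Combining (3.3) and (3.4) with Theorem 1.3 yields

$$\left|\mathbb{E}\,\mathrm{tr}\left[f(U^N_t)\right]\right| \le \sum_{n\in\mathbb{Z}} |\hat f(n)|\,\left|\nu^N_t(n) - \nu_t(n)\right| \le \sum_{n\in\mathbb{Z}} |\hat f(n)| \cdot \frac{t^2 n^4}{N^2}.$$

By assumption $f \in H_5(\mathbb{U}_1)$, and so

$$\sum_{n\in\mathbb{Z}} n^4 |\hat f(n)| = \sum_{n\in\mathbb{Z}\setminus\{0\}} \frac{1}{|n|} \cdot |n|^5 |\hat f(n)| \le \left(\sum_{n\in\mathbb{Z}\setminus\{0\}} \frac{1}{n^2}\right)^{1/2} \left(\sum_{n\in\mathbb{Z}} n^{10} |\hat f(n)|^2\right)^{1/2} \le \frac{\pi}{\sqrt{3}}\,\|f\|_{H_5} < \infty.$$

Taking $C(f) = \frac{\pi}{\sqrt{3}}\|f\|_{H_5}$ concludes the proof. □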
We now use Proposition 3.1 to give an improved variance estimate related to [34, Propositions 6.1,
6.2].

Proposition 3.2. Fix $0 < t < 4$. Let $f \in C^6(\mathbb{U}_1)$ with support disjoint from $\mathrm{supp}\,\nu_t$. There is a constant
$C'(f) > 0$ such that, for all $N \in \mathbb{N}$,

$$\mathrm{Var}\left[\mathrm{Tr}(f(U^N_t))\right] \le \frac{t^3 C'(f)}{N^2}.$$

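Proof. In the proof of [34, Proposition 3.1] (on p. 3179), and also in [9, Proposition 4.2 & Corollary 4.5],
the desired variance is shown to have the form

$$\mathrm{Var}\left[\mathrm{Tr}(f(U^N_t))\right] = \int_0^t \mathbb{E}\,\mathrm{tr}\left[f'(U^N_s V^N_{t-s})\, f'(U^N_s W^N_{t-s})\right] ds \tag{3.5}$$

where $U^N, V^N, W^N$ are three independent Brownian motions on $\mathbb{U}_N$. For fixed $s \in [0, t]$, we apply the
Cauchy-Schwarz inequality twice and use the equidistribution of $U^N_s V^N_{t-s}$ and $U^N_s W^N_{t-s}$ to yield

$$\left|\mathbb{E}\,\mathrm{tr}\left[f'(U^N_s V^N_{t-s})\, f'(U^N_s W^N_{t-s})\right]\right| \le \mathbb{E}\left(\left|\mathrm{tr}\left[f'(U^N_s V^N_{t-s})^2\right]\right|^{1/2} \cdot \left|\mathrm{tr}\left[f'(U^N_s W^N_{t-s})^2\right]\right|^{1/2}\right) \le \mathbb{E}\,\mathrm{tr}\left[f'(U^N_s V^N_{t-s})^2\right].$$

Since $U^N$ and $V^N$ are independent, $(U^N_s, V^N_{t-s})$ has the same distribution as $(U^N_s, (U^N_s)^{-1} U^N_t)$ (as the
increments are independent and stationary). Thus $\mathbb{E}\,\mathrm{tr}[f'(U^N_s V^N_{t-s})^2] = \mathbb{E}\,\mathrm{tr}[f'(U^N_t)^2]$, and so, integrating,
we find

$$\mathrm{Var}\left[\mathrm{Tr}(f(U^N_t))\right] \le t\, \mathbb{E}\,\mathrm{tr}\left[f'(U^N_t)^2\right]. \tag{3.6}$$

Since $f \in C^6(\mathbb{U}_1)$, the function $(f')^2$ is $C^5 \subset H_5$, and its support is still disjoint from $\mathrm{supp}\,\nu_t$; the result
now follows from Proposition 3.1 with $C'(f) = C((f')^2)$. □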


This brings us to the proof of the "hard edge" theorem. 


Proof of Theorem [O ] assuming Theorem 1.3 
a C°° bump function with values in [0,1] 


Fix a closed arc aCDi that is disjoint from supp v t . Let / be 
such that f\ a = 1 and supp / n supp v t = 0. Then 


P(spec (U t N ) Haf 0) < P(Tr [f(U t N )j > 1). 


(3.7) 


We now apply Chebyshev's inequality, in the following form: let Y = Tr[f(Ujf)]. Then, assuming 
1 — E(F) > 0, we have 


p(y > i) = p(y - E(y) > 1 - E(y)) < 


Var(y) 

MOf' 


In our case, we have |E(y)| = |ETr[/(f7/ v )]| = A^|Etr[/(17/ v )]| < 


t 2 c(f) 

N 


by Proposition 


3.1 


Thus, there 


is No (depending only on / and t) so that (1 — ETr[/([// v )]) 2 > \ for N > N () . Combining this with 
( |3.7| yields 

for N > No. 

2 t 3 C'(f) 


P(spec (U?) n af0)< 2Var[TF(/(t/f))] 


Now invoking Proposition 


3.2 


we find that this is < 


— iW 


whenever iV > A7 0 . It thus follows from 


the Borel-Cantelli lemma that P(spec(t// V ) fla/0) = O for all but finitely many N. 

Thus, we have shown that, for any closed arc $\alpha$ disjoint from $\mathrm{supp}\,\nu_t$, with probability 1, $\mathrm{spec}(U_t^N)$ is contained in $\mathbb{U}_1 \setminus \alpha$ for all large $N$. In particular, fixing any open arc $\beta \subset \mathbb{U}_1$ containing $\mathrm{supp}\,\nu_t$, this applies to $\alpha = \mathbb{U}_1 \setminus \beta$. That is, $\mathrm{spec}(U_t^N)$ is a.s. contained in any neighborhood of $\mathrm{supp}\,\nu_t$ for all sufficiently large $N$. This suffices to prove the theorem: because $\mathrm{Law}_{U_t^N}$ converges weakly almost surely to the measure $\nu_t$, which possesses a strictly positive continuous density on its support, any neighborhood of the spectrum of $U_t^N$ eventually covers $\mathrm{supp}\,\nu_t$. □

Thus, we have proved Theorem 1.1 under the assumption that Theorem 1.3 is true. Before turning to the proof of this latter result, let us recast Theorem 1.1 in the language of strong convergence, as we will proceed to generalize this to the fully noncommutative setting in Section 4.

Corollary 3.3. For $N \in \mathbb{N}$, let $(U_t^N)_{t \ge 0}$ be a Brownian motion on $\mathbb{U}_N$. Let $(u_t)_{t \ge 0}$ be a free unitary Brownian motion. Then for any fixed $t \ge 0$, $(U_t^N, (U_t^N)^*)$ converges strongly to $(u_t, u_t^*)$.

Proof. Since $U_t^N \to u_t$ in noncommutative distribution, strong convergence is the statement that

\[ \|P(U_t^N, (U_t^N)^*)\| \to \|P(u_t, u_t^*)\| \]

in operator norms. Fix a noncommutative polynomial $P$ in two variables, and let $p$ be the unique Laurent polynomial in one variable so that $P(U, U^*) = p(U)$ for every unitary operator $U$. Since $U_t^N$ is normal, $\|p(U_t^N)\| = \max\{|\lambda| : \lambda \in p(\mathrm{spec}(U_t^N))\}$; similarly, $\|p(u_t)\| = \max\{|\lambda| : \lambda \in p(\mathrm{supp}\,\nu_t)\}$, where $\mathrm{supp}\,\nu_t$ is the arc in (2.12).

Let $\Lambda_p^N = |p|(\mathrm{spec}(U_t^N))$, and let $\Lambda_p = |p|(\mathrm{supp}\,\nu_t)$. Since $\mathrm{spec}(U_t^N)$ converges to $\mathrm{supp}\,\nu_t$ in Hausdorff distance and all the sets are compact, it follows easily from the continuity of $|p|$ (on the unit circle) that $\Lambda_p^N$ converges to $\Lambda_p$ in Hausdorff distance as well. Now, suppose for a contradiction that $\max \Lambda_p^N$ does not converge to $\max \Lambda_p$. Note that $\Lambda_p$ is compact so contains its maximum, and since $\Lambda_p^N$ converges to the compact set $\Lambda_p$ in Hausdorff distance, it follows that the sequence $(\max \Lambda_p^N)_{N=1}^\infty$ is bounded, hence contains a convergent subsequence $(\max \Lambda_p^{N_k})_{k=1}^\infty$. Denote the limit of this sequence by $m$. Now, $\Lambda_p^{N_k}$ converges to $\Lambda_p$ in Hausdorff distance; in particular, for any fixed $\epsilon > 0$, $\Lambda_p^{N_k} \subseteq (\Lambda_p)_\epsilon$ for all large $k$ (where $(\Lambda_p)_\epsilon$ is the set of all points at distance $< \epsilon$ from $\Lambda_p$). Hence $\max \Lambda_p^{N_k} < \max \Lambda_p + \epsilon$, and so the limit satisfies $m \le \max \Lambda_p + \epsilon$. This holds for each $\epsilon > 0$, and so $m \le \max \Lambda_p$. By assumption $m \ne \max \Lambda_p$, and so $m < \max \Lambda_p$.

Thus $\lim_{k \to \infty} \max \Lambda_p^{N_k} < \max \Lambda_p$. In particular, there is some $\delta > 0$ so that, for all large $k$, $\max \Lambda_p^{N_k} < \max \Lambda_p - \delta$. But the fact that $\Lambda_p^{N_k} \to \Lambda_p$ in Hausdorff distance implies that $\Lambda_p \subseteq (\Lambda_p^{N_k})_\delta$ for all large $k$. This is a contradiction. So we have shown that $\|p(U_t^N)\| = \max \Lambda_p^N \to \max \Lambda_p = \|p(u_t)\|$, as desired. □


Remark 3.4. In fact, the converse of Corollary 3.3 also holds: strong convergence of $U_t^N \to u_t$ (for a fixed $t < 4$) implies convergence of the spectrum in Hausdorff distance. Indeed, suppose we know strong convergence. Since all the operators involved are unitaries, we may extend the test function space to continuous functions on the unit circle $\mathbb{U}_1$: let $f \in C(\mathbb{U}_1)$, fix $\epsilon > 0$, and by the Weierstrass approximation theorem, choose a polynomial $p$ with $\|p - f\|_{L^\infty(\mathbb{U}_1)} < \frac{\epsilon}{3}$. Applying unitary functional calculus, this means $\|p(U) - f(U)\| < \frac{\epsilon}{3}$ for any unitary operator $U$. By assumption of strong convergence, we know $\big|\|p(U_t^N)\| - \|p(u_t)\|\big| < \frac{\epsilon}{3}$ for all large $N$, and it therefore follows that $\big|\|f(U_t^N)\| - \|f(u_t)\|\big| < \epsilon$ for large $N$. So $\|f(U_t^N)\| \to \|f(u_t)\|$.

Now, let $\alpha \subset \mathbb{U}_1$ be a closed arc disjoint from $\mathrm{supp}\,\nu_t$, and let $f$ be a continuous bump function with values in $[0,1]$ such that $f|_\alpha = 1$ and $\mathrm{supp}\,f \cap \mathrm{supp}\,\nu_t = \emptyset$. By the strong convergence assumption, $\|f(U_t^N)\| \to \|f(u_t)\|$; but $\mathrm{supp}\,f$ is disjoint from the spectrum of $u_t$, so $\|f(u_t)\| = 0$. Hence, $f(U_t^N) \to 0$ in norm, which shows that $U_t^N$ asymptotically has no eigenvalues in $\alpha$. As above, this shows that $\mathrm{spec}(U_t^N)$ is eventually contained in any neighborhood of $\mathrm{supp}\,\nu_t$; the other half of the convergence in Hausdorff distance follows from the convergence in distribution (and strict positivity of the limit density of $\nu_t$ on $\mathrm{supp}\,\nu_t$).

When $t \ge 4$, $\mathrm{supp}\,\nu_t = \mathbb{U}_1$, and strong convergence becomes vacuously equivalent to the known convergence in distribution.

3.2 The proof of Theorem 1.3

We will actually prove the following Cauchy sequence growth estimate. We again use the notation

\[ \nu_t^N(n) = \mathbb{E}[\mathrm{tr}((U_t^N)^n)]. \]

Proposition 3.5. Let $N, n \in \mathbb{N}$, and fix $t \ge 0$. Then

\[ |\nu_t^N(n) - \nu_t^{2N}(n)| \le \frac{3 t^2 n^4}{4 N^2}. \tag{3.8} \]


This is the main technical result of the first part of the paper, and its proof will occupy most of this section. Let us first show how Theorem 1.3 follows from Proposition 3.5.

Proof of Theorem 1.3 assuming Proposition 3.5. Since $\lim_{N \to \infty} \nu_t^N(n) = \nu_t(n)$, we have the following convergent telescoping series:

\[ |\nu_t^N(n) - \nu_t(n)| = \Big| \sum_{k=0}^\infty \big( \nu_t^{N2^k}(n) - \nu_t^{N2^{k+1}}(n) \big) \Big| \le \sum_{k=0}^\infty \big| \nu_t^{N2^k}(n) - \nu_t^{N2^{k+1}}(n) \big|. \]

Now applying (3.8) with $N$ replaced by $N2^k$, we find

\[ \big| \nu_t^{N2^k}(n) - \nu_t^{N2^{k+1}}(n) \big| \le \frac{3 t^2 n^4}{4 (N2^k)^2} = \frac{3}{4} \cdot \frac{1}{4^k} \cdot \frac{t^2 n^4}{N^2}. \]

Summing the geometric series proves the theorem. □
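The constant produced by the telescoping argument can be checked with exact rational arithmetic. The following Python sketch (the function name `telescoped_coefficient` is ours, introduced only for illustration) sums the coefficients $\frac34 \cdot 4^{-k}$ coming from (3.8) and confirms that the full series equals $1$, so that the resulting bound is $t^2 n^4 / N^2$.

```python
from fractions import Fraction

def telescoped_coefficient(terms: int) -> Fraction:
    """Partial sum of sum_{k>=0} (3/4) * 4**(-k).

    This is the coefficient of t^2 n^4 / N^2 obtained by applying (3.8)
    with N replaced by N * 2**k for k = 0, ..., terms - 1.
    """
    return sum((Fraction(3, 4) * Fraction(1, 4 ** k) for k in range(terms)),
               Fraction(0))

# Closed form of the full geometric series: (3/4) / (1 - 1/4).
full_series = Fraction(3, 4) / (1 - Fraction(1, 4))
partial_30 = telescoped_coefficient(30)
print(full_series, float(partial_30))
```

The partial sums increase to the closed-form value, which is why no constant larger than $1$ is needed in front of $t^2 n^4 / N^2$.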

Remark 3.6. The bound (1.3) on the speed of convergence $\nu_t^N(n) \to \nu_t(n)$ depends polynomially on $n$; this is crucial to the proof. In [30, Section 3.3], the author proved a bound of the form $K(t,n)/N^2$,
where K(t, n) ~ exp(^-). This growth in n is much too large to get control over test functions / 

that are only in a Sobolev space, or even in $C^\infty(\mathbb{U}_1)$; the largest class of functions for which this Fourier series is summable is an ultra-analytic Gevrey class. That blunter estimate was proved not only for $U^N$, however, but for a family of diffusions on $\mathrm{GL}_N$ including both $U^N$ and the Brownian motion on $\mathrm{GL}_N$. It remains open whether a polynomial bound like (1.3) holds for this wider class of diffusions.

Hence, we turn to the proof of Proposition 3.5. Fix a Brownian motion $U^{2N}$ on $\mathbb{U}_{2N}$, along with two Brownian motions $U^{N,1}, U^{N,2}$ on $\mathbb{U}_N$, so that the processes $U^{2N}, U^{N,1}, U^{N,2}$ are all independent. For $t \ge 0$, let $B_t^{2N} \in \mathbb{U}_{2N}$ denote the block diagonal random matrix

\[ B_t^{2N} = \begin{bmatrix} U_t^{N,1} & 0 \\ 0 & U_t^{N,2} \end{bmatrix} \in \mathbb{U}_{2N}. \]

Let us introduce the notation

\[ A_s^{2N} = U_{t-s}^{2N} B_s^{2N} \]

as this process will be used very often in what follows.

Now, using the notation of (2.7), for any $n \in \mathbb{N}$ and any element $\sigma \in \mathbb{C}[S_n]$, we define

\[ F(s,\sigma) = \mathbb{E}\big[F^{2N}(\sigma, A_s^{2N})\big] \tag{3.9} \]

where for readability we hide the explicit dependence of $F(s,\sigma)$ on $N, n, t$. Taking $s = 0$, since $B_0^{2N} = I_{2N}$, this gives $\mathbb{E}[F^{2N}(\sigma, U_t^{2N})]$, while taking $s = t$ yields $\mathbb{E}[F^{2N}(\sigma, B_t^{2N})]$. Now taking $\sigma = (1 \cdots n)$ to be the full cycle (so that $N^{\#\sigma} = N$), from the definition of normalized trace we have

\[ F(0, (1 \cdots n)) = \nu_t^{2N}(n), \qquad F(t, (1 \cdots n)) = \nu_t^N(n). \]

Thus, the quantity we wish to estimate may be computed as

\[ \nu_t^N(n) - \nu_t^{2N}(n) = \int_0^t \frac{d}{ds} F(s, (1 \cdots n))\,ds \tag{3.10} \]

provided this derivative exists. We now proceed to show that it does, and compute it.

Denote by $P_1, P_2 \in M_{2N}$ the projection matrices from $\mathbb{C}^{2N} = \mathbb{C}^N \oplus \mathbb{C}^N$ onto the two factors; to be precise, $P_1 = \mathrm{diag}[1, \ldots, 1, 0, \ldots, 0]$ and $P_2 = I_{2N} - P_1$. For any $A, B \in M_{2N}$ and $1 \le i < j \le n$, denote by $(A \otimes B)_{i,j}$ the matrix

\[ (A \otimes B)_{i,j} = I^{\otimes (i-1)} \otimes A \otimes I^{\otimes (j-i-1)} \otimes B \otimes I^{\otimes (n-j)} \tag{3.11} \]

where $I = I_{2N}$ is the identity matrix in $M_{2N}$.

Lemma 3.7. Fix $t \ge 0$ and $N, n \in \mathbb{N}$, and let $G \colon [0,t] \to M_{2N}^{\otimes n}$ denote the function $G(s) = \mathbb{E}[(A_s^{2N})^{\otimes n}]$. Then $G \in C^1[0,t]$, and

\[ G'(s) = \frac{1}{2N}\, G(s) \sum_{1 \le i < j \le n} (i\,j)\big[ I - 2 (P_1 \otimes P_1)_{i,j} - 2 (P_2 \otimes P_2)_{i,j} \big] \tag{3.12} \]

where $(i\,j)$ denotes the Schur-Weyl representation of the transposition in $S_n$, cf. (2.5).

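Proof. To begin, note that $U^{2N}$ and $B^{2N}$ are independent, and so it follows that

\[ G(s) = \mathbb{E}[(U_{t-s}^{2N})^{\otimes n}]\, \mathbb{E}[(B_s^{2N})^{\otimes n}] \equiv G_1(s) G_2(s), \tag{3.13} \]

where both factors $G_1, G_2$ are continuous (since they are expectations of polynomials in diffusions). Using the SDE (2.2) and applying Itô's formula to the diffusion $B^{2N}$ shows that there is an $L^2$-martingale $(M_s)_{s \ge 0}$ such that

\[ d\big((B_s^{2N})^{\otimes n}\big) = dM_s - \frac{n}{2} (B_s^{2N})^{\otimes n}\,ds - \frac{1}{N} \sum_{1 \le i < j \le n} \sum_{\ell=1}^{2} \sum_{1 \le a,b \le N} (B_s^{2N})^{\otimes n} \cdot \big( E_{a+\ell N,\, b+\ell N} \otimes E_{b+\ell N,\, a+\ell N} \big)_{i,j}\,ds \]

where $E_{c,d} \in M_{2N}$ is the standard matrix unit (all $0$ entries except a $1$ in entry $(c,d)$), with indices written modulo $2N$. This simplifies as follows: recalling that we identify an element $\sigma \in \mathbb{C}[S_n]$ with a matrix (in this case in $M_{2N}^{\otimes n}$) via the faithful action (2.5), we can write this SDE in the form

\[ d\big((B_s^{2N})^{\otimes n}\big) = dM_s - \frac{n}{2} (B_s^{2N})^{\otimes n}\,ds - \frac{1}{N} \sum_{1 \le i < j \le n} \sum_{\ell=1}^{2} (B_s^{2N})^{\otimes n} (i\,j) (P_\ell \otimes P_\ell)_{i,j}\,ds. \tag{3.14} \]

It follows that $G_2 \in C^1[0,t]$, and

\[ G_2'(s) = -\frac{n}{2} G_2(s) - \frac{1}{N} \sum_{1 \le i < j \le n} \sum_{\ell=1}^{2} G_2(s) (i\,j) (P_\ell \otimes P_\ell)_{i,j}. \tag{3.15} \]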

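At the same time, a similar calculation with Itô's formula applied with (2.2) shows that there is an $L^2$-martingale $(M_s)_{s \ge 0}$ such that

\[ d\big((U_s^{2N})^{\otimes n}\big) = dM_s - \frac{n}{2} (U_s^{2N})^{\otimes n}\,ds - \frac{1}{2N} \sum_{1 \le i < j \le n} (U_s^{2N})^{\otimes n} (i\,j)\,ds \tag{3.16} \]

which, changing $s \mapsto t - s$, implies that $G_1$ is $C^1[0,t]$ and

\[ G_1'(s) = \frac{n}{2} G_1(s) + \frac{1}{2N} \sum_{1 \le i < j \le n} \mathbb{E}\big[ (U_{t-s}^{2N})^{\otimes n} (i\,j) \big]. \tag{3.17} \]

Combining (3.17) and (3.15), the product rule $G'(s) = G_1(s) G_2'(s) + G_2(s) G_1'(s)$ shows that $G \in C^1[0,t]$. Using $G = G_1 \cdot G_2$ again when recombining, we see that the $\frac{n}{2}$ terms cancel; moreover, the same recombination due to independence yields

\[ G'(s) = \frac{1}{2N} \sum_{1 \le i < j \le n} \mathbb{E}\big[ (U_{t-s}^{2N})^{\otimes n} (i\,j) (B_s^{2N})^{\otimes n} \big] - \frac{1}{N} \sum_{1 \le i < j \le n} \sum_{\ell=1}^{2} \mathbb{E}\big[ (A_s^{2N})^{\otimes n} (i\,j) (P_\ell \otimes P_\ell)_{i,j} \big]. \]

Finally, in the first term, notice that $(i\,j)(B_s^{2N})^{\otimes n} = (B_s^{2N})^{\otimes n}(i\,j)$ (since the Schur-Weyl representation of any permutation commutes with any matrix of the form $B^{\otimes n}$). Hence, we have

\[ \mathbb{E}\big[ (U_{t-s}^{2N})^{\otimes n} (i\,j) (B_s^{2N})^{\otimes n} \big] = \mathbb{E}[(U_{t-s}^{2N})^{\otimes n}]\, \mathbb{E}[(i\,j)(B_s^{2N})^{\otimes n}] = \mathbb{E}[(U_{t-s}^{2N})^{\otimes n}]\, \mathbb{E}[(B_s^{2N})^{\otimes n}(i\,j)] = G(s)(i\,j). \]

Similarly factoring out the $G(s)$ from the second term yields the result. □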

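Now, note that

\[ F(s,\sigma) = \mathbb{E}\big[F^{2N}(\sigma, A_s^{2N})\big] = \frac{1}{(2N)^{\#\sigma}}\, \mathbb{E}\,\mathrm{Tr}\big[ \sigma \cdot (A_s^{2N})^{\otimes n} \big] = \frac{1}{(2N)^{\#\sigma}}\, \mathrm{Tr}[\sigma \cdot G(s)]. \tag{3.18} \]

This shows that $F(\cdot, \sigma) \in C^1[0,t]$, and so (3.10) is valid. Further, we can use (3.12) to compute the integrand there. To that end, we introduce the following auxiliary functions: for $p, q \in \mathbb{N}$ and $s \in [0,t]$, denote

\[ H_{p,q}(s) = \frac{1}{4N^2}\, \mathbb{E}\big[ \mathrm{Tr}((A_s^{2N})^p)\, \mathrm{Tr}((A_s^{2N})^q) \big] - \frac{1}{2N^2} \sum_{\ell=1}^{2} \mathbb{E}\big[ \mathrm{Tr}((A_s^{2N})^p P_\ell)\, \mathrm{Tr}((A_s^{2N})^q P_\ell) \big]. \tag{3.19} \]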


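Lemma 3.8. For $0 \le s \le t$,

\[ \frac{d}{ds} F(s, (1 \cdots n)) = \frac{n}{2} \sum_{p=1}^{n-1} H_{p,n-p}(s). \tag{3.20} \]

Proof. Applying (3.18) with $\sigma = (1 \cdots n)$, and using (3.12), we have

\[ \frac{d}{ds} F(s, (1 \cdots n)) = \frac{1}{2N}\, \mathrm{Tr}[(1 \cdots n) G'(s)] = \frac{1}{4N^2} \sum_{1 \le i < j \le n} \Big( \mathrm{Tr}[(1 \cdots n) G(s) (i\,j)] - 2 \sum_{\ell=1}^{2} \mathrm{Tr}[(1 \cdots n) G(s) (i\,j) (P_\ell \otimes P_\ell)_{i,j}] \Big). \]

Using the trace property and noting that $(i\,j)(1 \cdots n) = (1 \cdots i-1\; j \cdots n)(i \cdots j-1)$, a simple calculation shows that

\[ \mathrm{Tr}[(1 \cdots i-1\; j \cdots n)(i \cdots j-1) G(s)] = \mathbb{E}\big[ \mathrm{Tr}((A_s^{2N})^{j-i})\, \mathrm{Tr}((A_s^{2N})^{n-(j-i)}) \big]. \]

A similar calculation shows that

\[ \mathrm{Tr}[(1 \cdots n) G(s) (i\,j) (P_\ell \otimes P_\ell)_{i,j}] = \mathbb{E}\big[ \mathrm{Tr}((A_s^{2N})^{j-i} P_\ell)\, \mathrm{Tr}((A_s^{2N})^{n-(j-i)} P_\ell) \big]. \]

Thus, we have

\[ \frac{d}{ds} F(s, (1 \cdots n)) = \frac{1}{4N^2} \sum_{1 \le i < j \le n} \Big( \mathbb{E}\big[ \mathrm{Tr}((A_s^{2N})^{j-i})\, \mathrm{Tr}((A_s^{2N})^{n-(j-i)}) \big] - 2 \sum_{\ell=1}^{2} \mathbb{E}\big[ \mathrm{Tr}((A_s^{2N})^{j-i} P_\ell)\, \mathrm{Tr}((A_s^{2N})^{n-(j-i)} P_\ell) \big] \Big). \]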


Breaking this into two sums, each has the form $\sum_{1 \le i < j \le n} h_{j-i,\,n-(j-i)}$ for some symmetric function $h \colon \{1, \ldots, n-1\}^2 \to \mathbb{C}$. For such a sum in general we have

\[ S = \sum_{1 \le i < j \le n} h_{j-i,\,n-(j-i)} = \sum_{p=1}^{n-1} \sum_{\substack{1 \le i < j \le n \\ j-i=p}} h_{p,n-p} = \sum_{p=1}^{n-1} (n-p)\, h_{p,n-p} \]

since the number of $(i,j)$ with $1 \le i < j \le n$ and $j - i = p$ is $(n-p)$. Now using the symmetry and reindexing by $q = n - p$ we have

\[ 2S = \sum_{p=1}^{n-1} (n-p)\, h_{p,n-p} + \sum_{q=1}^{n-1} q\, h_{n-q,q} = \sum_{p=1}^{n-1} (n-p)\, h_{p,n-p} + \sum_{p=1}^{n-1} p\, h_{p,n-p} = n \sum_{p=1}^{n-1} h_{p,n-p}. \]

Applying this with the above summations yields the result. □

From (3.20) and (3.10), we therefore have

\[ \nu_t^N(n) - \nu_t^{2N}(n) = \frac{n}{2} \sum_{p=1}^{n-1} \int_0^t H_{p,n-p}(s)\,ds. \tag{3.21} \]
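The combinatorial reduction used in the proof of Lemma 3.8 (collapsing a sum over pairs $i < j$ into $\frac{n}{2}$ times a sum over $p$) is easy to confirm by brute force. The following Python sketch does so for a sample symmetric function; the names `pair_sum`, `reduced_sum`, `gap_count`, and the test function `h_example` are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction

def pair_sum(h, n):
    """S = sum over 1 <= i < j <= n of h(j - i, n - (j - i))."""
    return sum(h(j - i, n - (j - i))
               for i in range(1, n + 1)
               for j in range(i + 1, n + 1))

def reduced_sum(h, n):
    """(n / 2) * sum_{p=1}^{n-1} h(p, n - p)."""
    return Fraction(n, 2) * sum(h(p, n - p) for p in range(1, n))

def gap_count(n, p):
    """Number of pairs (i, j) with 1 <= i < j <= n and j - i = p."""
    return sum(1 for i in range(1, n + 1)
               for j in range(i + 1, n + 1) if j - i == p)

def h_example(p, q):
    """An arbitrary symmetric test function (assumption: any symmetric h works)."""
    return p * q + (p - q) ** 2

for n in range(2, 10):
    print(n, pair_sum(h_example, n), reduced_sum(h_example, n))
```

Symmetry of $h$ is essential here: for a non-symmetric $h$ only the intermediate identity $S = \sum_p (n-p) h_{p,n-p}$ survives.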


It now behooves us to estimate the terms $H_{p,n-p}$, cf. (3.19). Since the result (Proposition 3.5) gives an $O(1/N^2)$-estimate, we must show that $H_{p,q}(s) = O(1/N^2)$. Note, however, that (3.19) involves expectations of unnormalized traces of powers of $A^{2N}$. As this process possesses a limit noncommutative distribution in terms of normalized traces, the leading order contribution of the first term in $H_{p,q}$ is $O(1)$. In fact, there are cancellations between the two terms: $H_{p,q}(s)$ is actually a difference of covariances.


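Lemma 3.9. For $s \ge 0$ and $p, q \in \mathbb{N}$,

\[ N^2 H_{p,q}(s) = \frac14\, \mathrm{Cov}\big[ \mathrm{Tr}((A_s^{2N})^p),\, \mathrm{Tr}((A_s^{2N})^q) \big] - \mathrm{Cov}\big[ \mathrm{Tr}((A_s^{2N})^p P_1),\, \mathrm{Tr}((A_s^{2N})^q P_1) \big]. \tag{3.22} \]

Proof. From (3.19), $N^2 H_{p,q}(s)$ is a difference of two terms. The first is

\[ \frac14\, \mathbb{E}\big[ \mathrm{Tr}((A_s^{2N})^p)\, \mathrm{Tr}((A_s^{2N})^q) \big] = \frac14\, \mathrm{Cov}\big[ \mathrm{Tr}((A_s^{2N})^p),\, \mathrm{Tr}((A_s^{2N})^q) \big] + \frac14\, \mathbb{E}[\mathrm{Tr}((A_s^{2N})^p)]\, \mathbb{E}[\mathrm{Tr}((A_s^{2N})^q)]. \tag{3.23} \]

The second term is a sum

\[ \frac12 \sum_{\ell=1}^{2} \mathbb{E}\big[ \mathrm{Tr}((A_s^{2N})^p P_\ell)\, \mathrm{Tr}((A_s^{2N})^q P_\ell) \big]. \]

Let $R$ be the block rotation matrix of $\mathbb{C}^{2N}$ by $\frac{\pi}{2}$ in each factor of $\mathbb{C}^N$, so that $R P_1 R^* = P_2$. Since the distribution of $A_s^{2N}$ is invariant under rotations, it follows that

\[ \mathbb{E}[\mathrm{Tr}((A_s^{2N})^p P_\ell)] \quad \text{and} \quad \mathbb{E}\big[ \mathrm{Tr}((A_s^{2N})^p P_\ell)\, \mathrm{Tr}((A_s^{2N})^q P_\ell) \big] \]

do not depend on $\ell$ (as each is a conjugation-invariant polynomial function in $A_s^{2N}$). In particular, the two terms in the $\ell$-sum are equal, and so the second term in $N^2 H_{p,q}(s)$ is

\[ -\mathbb{E}\big[ \mathrm{Tr}((A_s^{2N})^p P_1)\, \mathrm{Tr}((A_s^{2N})^q P_1) \big] = -\mathrm{Cov}\big[ \mathrm{Tr}((A_s^{2N})^p P_1),\, \mathrm{Tr}((A_s^{2N})^q P_1) \big] - \mathbb{E}[\mathrm{Tr}((A_s^{2N})^p P_1)]\, \mathbb{E}[\mathrm{Tr}((A_s^{2N})^q P_1)]. \tag{3.24} \]

Moreover, since $\mathbb{E}[\mathrm{Tr}((A_s^{2N})^p P_1)] = \mathbb{E}[\mathrm{Tr}((A_s^{2N})^p P_2)]$ and $P_1 + P_2 = I$, we have $\mathbb{E}[\mathrm{Tr}((A_s^{2N})^p P_1)] = \frac12 \mathbb{E}[\mathrm{Tr}((A_s^{2N})^p)]$. Thus, the last term in (3.24) is $-\frac14 \mathbb{E}[\mathrm{Tr}((A_s^{2N})^p)]\, \mathbb{E}[\mathrm{Tr}((A_s^{2N})^q)]$. Combining this with (3.23) then yields the result. □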

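Therefore, we are left to estimate these two covariance terms. We do so by expanding them in terms of the Schur-Weyl representation. Let $\gamma_n \in S_n$ be the full cycle and let $\gamma_{p,q}$ denote the double-cycle in $S_{p+q}$:

\[ \gamma_n = (1 \cdots n), \qquad \gamma_{p,q} = (1 \cdots p)(p+1 \cdots p+q) \in S_{p+q}. \]

Then, for any matrix $A \in M_{2N}$, (2.6) gives

\[ \mathrm{Tr}(A^n) = \mathrm{Tr}[A^{\otimes n} \cdot \gamma_n], \qquad \mathrm{Tr}(A^p)\,\mathrm{Tr}(A^q) = \mathrm{Tr}[A^{\otimes p} \otimes A^{\otimes q} \cdot \gamma_{p,q}]. \]

It follows that

\[ \begin{aligned} \mathrm{Cov}\big[ \mathrm{Tr}((A_s^{2N})^p),\, \mathrm{Tr}((A_s^{2N})^q) \big] &= \mathbb{E}\big[ \mathrm{Tr}((A_s^{2N})^p)\, \mathrm{Tr}((A_s^{2N})^q) \big] - \mathbb{E}[\mathrm{Tr}((A_s^{2N})^p)]\, \mathbb{E}[\mathrm{Tr}((A_s^{2N})^q)] \\ &= \mathbb{E}\,\mathrm{Tr}\big[ (A_s^{2N})^{\otimes p} \otimes (A_s^{2N})^{\otimes q} \cdot \gamma_{p,q} \big] - \mathbb{E}\,\mathrm{Tr}\big[ (A_s^{2N})^{\otimes p} \cdot \gamma_p \big]\, \mathbb{E}\,\mathrm{Tr}\big[ (A_s^{2N})^{\otimes q} \cdot \gamma_q \big] \\ &= \mathrm{Tr}\big[ \mathbb{E}((A_s^{2N})^{\otimes (p+q)}) \cdot \gamma_{p,q} \big] - \mathrm{Tr}\big[ \mathbb{E}((A_s^{2N})^{\otimes p}) \otimes \mathbb{E}((A_s^{2N})^{\otimes q}) \cdot \gamma_{p,q} \big] \\ &= \mathrm{Tr}\Big[ \Big( \mathbb{E}((A_s^{2N})^{\otimes (p+q)}) - \mathbb{E}((A_s^{2N})^{\otimes p}) \otimes \mathbb{E}((A_s^{2N})^{\otimes q}) \Big) \cdot \gamma_{p,q} \Big]. \end{aligned} \tag{3.25} \]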


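At the same time, using the fact that the projections $P_\ell$ are diagonal, we have for any matrix $A \in M_{2N}$

\[ \mathrm{Tr}(A^p P_1)\, \mathrm{Tr}(A^q P_1) = \mathrm{Tr}\big[ A^{\otimes p} \otimes A^{\otimes q} \cdot (P_1 \otimes P_1)_{p,p+q} \cdot \gamma_{p,q} \big], \]

where we remind the reader that $(P_\ell \otimes P_\ell)_{i,j}$ references notation (3.11). A similar calculation to the one above confirms that

\[ \mathrm{Cov}\big[ \mathrm{Tr}((A_s^{2N})^p P_1),\, \mathrm{Tr}((A_s^{2N})^q P_1) \big] = \mathrm{Tr}\Big[ \Big( \mathbb{E}((A_s^{2N})^{\otimes (p+q)}) - \mathbb{E}((A_s^{2N})^{\otimes p}) \otimes \mathbb{E}((A_s^{2N})^{\otimes q}) \Big) \cdot (P_1 \otimes P_1)_{p,p+q} \cdot \gamma_{p,q} \Big]. \tag{3.26} \]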

Thus, from (3.22), (3.25), and (3.26), to estimate $H_{p,n-p}(s)$ we must understand the tensor

\[ \mathbb{E}((A_s^{2N})^{\otimes n}) - \mathbb{E}((A_s^{2N})^{\otimes p}) \otimes \mathbb{E}((A_s^{2N})^{\otimes (n-p)}). \tag{3.27} \]

To that end, we introduce some notation. For $1 \le i < j \le n$, define the linear operator $T_{i,j}$ on $(\mathbb{C}^{2N})^{\otimes n}$ by

\[ T_{i,j} = 2\,(i\,j)\,[P_1 \otimes P_1 + P_2 \otimes P_2]_{i,j}. \tag{3.28} \]

Additionally, for $1 \le p \le n$, we introduce the linear operators $\Phi_p$ and $\Psi_p$ as follows:

\[ \Phi_p = -\frac{1}{2N} \sum_{\substack{1 \le i < j \le p, \text{ or} \\ p < i < j \le n}} (i\,j), \qquad \Psi_p = -\frac{1}{2N} \sum_{\substack{1 \le i < j \le p, \text{ or} \\ p < i < j \le n}} T_{i,j}, \tag{3.29} \]

with the understanding that, when $p = n$, the sum is simply over $1 \le i < j \le n$.
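Lemma 3.10. Let $p \in \{1, \ldots, n\}$, and let $0 \le s \le t$. Then

\[ \mathbb{E}[(U_s^{2N})^{\otimes p}] \otimes \mathbb{E}[(U_s^{2N})^{\otimes (n-p)}] = e^{-\frac{ns}{2}} e^{s \Phi_p}, \tag{3.30} \]

\[ \mathbb{E}[(B_s^{2N})^{\otimes p}] \otimes \mathbb{E}[(B_s^{2N})^{\otimes (n-p)}] = e^{-\frac{ns}{2}} e^{s \Psi_p}, \tag{3.31} \]

\[ \mathbb{E}[(A_s^{2N})^{\otimes p}] \otimes \mathbb{E}[(A_s^{2N})^{\otimes (n-p)}] = e^{-\frac{nt}{2}} e^{(t-s) \Phi_p} e^{s \Psi_p}, \tag{3.32} \]

where, in the case $p = n$, we interpret the $0$-fold tensor product as the identity as usual.

Proof. Returning to the SDEs (3.14) and (3.16), taking expectations we find that

\[ \frac{d}{ds} \mathbb{E}[(U_s^{2N})^{\otimes n}] = -\frac{n}{2} \mathbb{E}[(U_s^{2N})^{\otimes n}] - \frac{1}{2N} \mathbb{E}[(U_s^{2N})^{\otimes n}] \sum_{1 \le i < j \le n} (i\,j), \]

\[ \frac{d}{ds} \mathbb{E}[(B_s^{2N})^{\otimes n}] = -\frac{n}{2} \mathbb{E}[(B_s^{2N})^{\otimes n}] - \frac{1}{2N} \mathbb{E}[(B_s^{2N})^{\otimes n}] \sum_{1 \le i < j \le n} T_{i,j}. \]

Since the processes $B^{2N}$ and $U^{2N}$ start at the identity, it is then immediate to verify that the solutions to these ODEs are

\[ \mathbb{E}[(U_s^{2N})^{\otimes n}] = e^{-\frac{ns}{2}} \exp\Big\{ -\frac{s}{2N} \sum_{1 \le i < j \le n} (i\,j) \Big\}, \quad \text{and} \quad \mathbb{E}[(B_s^{2N})^{\otimes n}] = e^{-\frac{ns}{2}} \exp\Big\{ -\frac{s}{2N} \sum_{1 \le i < j \le n} T_{i,j} \Big\}. \tag{3.33} \]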

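Now, for the tensor product, we decompose

\[ \mathbb{E}[(U_s^{2N})^{\otimes p}] \otimes \mathbb{E}[(U_s^{2N})^{\otimes (n-p)}] = \big( \mathbb{E}[(U_s^{2N})^{\otimes p}] \otimes I^{\otimes (n-p)} \big) \cdot \big( I^{\otimes p} \otimes \mathbb{E}[(U_s^{2N})^{\otimes (n-p)}] \big). \]

We can express these expectations as in (3.33), provided we note that $(i\,j)$ now refers (alternatively) to the action of $S_p$ and $S_{n-p}$ on $(\mathbb{C}^{2N})^{\otimes p} \otimes (\mathbb{C}^{2N})^{\otimes (n-p)}$, either trivially in the first factor or the second. The result is that

\[ \mathbb{E}[(U_s^{2N})^{\otimes p}] \otimes I^{\otimes (n-p)} = e^{-\frac{ps}{2}} \exp\Big\{ -\frac{s}{2N} \sum_{1 \le i < j \le p} (i\,j) \Big\}, \qquad I^{\otimes p} \otimes \mathbb{E}[(U_s^{2N})^{\otimes (n-p)}] = e^{-\frac{(n-p)s}{2}} \exp\Big\{ -\frac{s}{2N} \sum_{p < i' < j' \le n} (i'\,j') \Big\}. \tag{3.34} \]

Note that all the $(i\,j)$ terms in the first sum commute with all the $(i'\,j')$ terms in the second sum (since $i < j < i' < j'$), and so, taking the product, we can combine to yield

\[ \mathbb{E}[(U_s^{2N})^{\otimes p}] \otimes \mathbb{E}[(U_s^{2N})^{\otimes (n-p)}] = e^{-\frac{ns}{2}} \exp\Big\{ -\frac{s}{2N} \sum_{\substack{1 \le i < j \le p, \text{ or} \\ p < i < j \le n}} (i\,j) \Big\} = e^{-\frac{ns}{2}} e^{s \Phi_p}, \]

verifying (3.30). An entirely analogous analysis proves (3.31). Finally, using independence as in (3.13) to factor

\[ \mathbb{E}[(A_s^{2N})^{\otimes p}] \otimes \mathbb{E}[(A_s^{2N})^{\otimes (n-p)}] = \big( \mathbb{E}[(U_{t-s}^{2N})^{\otimes p}] \otimes \mathbb{E}[(U_{t-s}^{2N})^{\otimes (n-p)}] \big) \cdot \big( \mathbb{E}[(B_s^{2N})^{\otimes p}] \otimes \mathbb{E}[(B_s^{2N})^{\otimes (n-p)}] \big) \tag{3.35} \]

and substituting $s \mapsto t - s$ in (3.30), (3.32) follows from (3.31) and (3.35). □

Hence, the desired quantity (3.27) is computed, via (3.32), as

\[ \mathbb{E}((A_s^{2N})^{\otimes n}) - \mathbb{E}((A_s^{2N})^{\otimes p}) \otimes \mathbb{E}((A_s^{2N})^{\otimes (n-p)}) = e^{-\frac{nt}{2}} \big[ e^{(t-s)\Phi_n} e^{s \Psi_n} - e^{(t-s)\Phi_p} e^{s \Psi_p} \big]. \tag{3.36} \]

To compute this, we recall Duhamel's formula: for complex matrices $C, D$,

\[ e^C - e^D = \int_0^1 e^{(1-u)C} (C - D) e^{uD}\,du. \]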
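Duhamel's formula follows from differentiating $u \mapsto e^{(1-u)C} e^{uD}$, and it can also be sanity-checked numerically. The following pure-Python sketch compares both sides for a pair of non-commuting $2 \times 2$ matrices; the helper names, the truncated Taylor series used for the matrix exponential, and the Simpson quadrature are all our own illustrative choices rather than anything taken from the paper.

```python
def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(a, b, scale=1.0):
    """Entrywise a + scale * b."""
    return [[x + scale * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_scale(a, c):
    return [[c * x for x in row] for row in a]

def mat_exp(a, terms=40):
    """Matrix exponential via a truncated Taylor series (fine for small norms)."""
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = mat_scale(mat_mul(term, a), 1.0 / k)   # term = a^k / k!
        result = mat_add(result, term)
    return result

def duhamel_integral(c, d, steps=200):
    """Simpson approximation of int_0^1 e^{(1-u)C} (C - D) e^{uD} du (steps even)."""
    diff = mat_add(c, d, scale=-1.0)

    def integrand(u):
        left = mat_exp(mat_scale(c, 1.0 - u))
        right = mat_exp(mat_scale(d, u))
        return mat_mul(mat_mul(left, diff), right)

    h = 1.0 / steps
    total = mat_add(integrand(0.0), integrand(1.0))
    for k in range(1, steps):
        total = mat_add(total, integrand(k * h), scale=4.0 if k % 2 else 2.0)
    return mat_scale(total, h / 3.0)

def max_abs_difference(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

C = [[0.3, -0.5], [0.2, 0.1]]
D = [[-0.4, 0.7], [0.1, 0.6]]
lhs = mat_add(mat_exp(C), mat_exp(D), scale=-1.0)   # e^C - e^D
rhs = duhamel_integral(C, D)
print(max_abs_difference(lhs, rhs))
```

The order of the factors in the integrand matters: since $C$ and $D$ need not commute, $C - D$ must sit between the two exponentials.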


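To apply this to (3.36), we add and subtract $e^{(t-s)\Phi_p} e^{s \Psi_n}$. Duhamel's formula then expresses this difference as follows:

\[ \begin{aligned} e^{(t-s)\Phi_n} e^{s\Psi_n} - e^{(t-s)\Phi_p} e^{s\Psi_p} &= \big[ e^{(t-s)\Phi_n} - e^{(t-s)\Phi_p} \big] e^{s\Psi_n} + e^{(t-s)\Phi_p} \big[ e^{s\Psi_n} - e^{s\Psi_p} \big] \\ &= \int_0^1 e^{(1-u)(t-s)\Phi_n} (t-s)(\Phi_n - \Phi_p) e^{u(t-s)\Phi_p}\,du \cdot e^{s\Psi_n} + e^{(t-s)\Phi_p} \int_0^1 e^{(1-u)s\Psi_n} s(\Psi_n - \Psi_p) e^{us\Psi_p}\,du \\ &= \int_0^{t-s} e^{(t-s-u)\Phi_n} (\Phi_n - \Phi_p) e^{u\Phi_p} e^{s\Psi_n}\,du + \int_0^s e^{(t-s)\Phi_p} e^{(s-u)\Psi_n} (\Psi_n - \Psi_p) e^{u\Psi_p}\,du \end{aligned} \]

where we have made the substitution $u \mapsto (t-s)u$ in the first integral and $u \mapsto su$ in the second. Now, from (3.29),

\[ \Phi_n - \Phi_p = -\frac{1}{2N} \sum_{1 \le i \le p < j \le n} (i\,j) \quad \text{and} \quad \Psi_n - \Psi_p = -\frac{1}{2N} \sum_{1 \le i \le p < j \le n} T_{i,j}. \]

Hence, (3.36) yields

\[ \mathbb{E}((A_s^{2N})^{\otimes n}) - \mathbb{E}((A_s^{2N})^{\otimes p}) \otimes \mathbb{E}((A_s^{2N})^{\otimes (n-p)}) = -\frac{e^{-\frac{nt}{2}}}{2N} \sum_{1 \le i \le p < j \le n} \Big( \int_0^{t-s} e^{(t-s-u)\Phi_n} (i\,j)\, e^{u\Phi_p} e^{s\Psi_n}\,du + \int_0^s e^{(t-s)\Phi_p} e^{(s-u)\Psi_n}\, T_{i,j}\, e^{u\Psi_p}\,du \Big). \tag{3.37} \]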

We now reinterpret the exponentials in the integrals as expectations of the processes $U^{2N}$ and $B^{2N}$, using (3.30) and (3.31). The first integrand is

\[ e^{-\frac{nt}{2}} e^{(t-s-u)\Phi_n} (i\,j)\, e^{u\Phi_p} e^{s\Psi_n} = \mathbb{E}[(U_{t-s-u}^{2N})^{\otimes n}] \cdot (i\,j) \cdot \mathbb{E}[(U_u^{2N})^{\otimes p}] \otimes \mathbb{E}[(U_u^{2N})^{\otimes (n-p)}] \cdot \mathbb{E}[(B_s^{2N})^{\otimes n}]. \]

By definition $U^{2N}$ and $B^{2N}$ are independent. Let us introduce two more copies $V^{2N}, W^{2N}$ of $U^{2N}$ so that $\{U^{2N}, V^{2N}, W^{2N}, B^{2N}\}$ are all independent. Then this product may be expressed as the expectation of

\[ R^p_{i,j}(u; s, t) = (U_{t-s-u}^{2N})^{\otimes n} \cdot (i\,j) \cdot (V_u^{2N})^{\otimes p} \otimes (W_u^{2N})^{\otimes (n-p)} \cdot (B_s^{2N})^{\otimes n}. \tag{3.38} \]

Similarly, if we introduce independent copies $C^{2N}$ and $D^{2N}$ of $B^{2N}$ so that all the processes $V^{2N}, W^{2N}, B^{2N}, C^{2N}, D^{2N}$ are independent, the second integrand can be expressed as the expectation of

\[ Q^p_{i,j}(u; s, t) = (V_{t-s}^{2N})^{\otimes p} \otimes (W_{t-s}^{2N})^{\otimes (n-p)} \cdot (B_{s-u}^{2N})^{\otimes n} \cdot T_{i,j} \cdot (C_u^{2N})^{\otimes p} \otimes (D_u^{2N})^{\otimes (n-p)}. \tag{3.39} \]

To summarize, we have computed the following.

Lemma 3.11. Let $0 \le s \le t$ and $p \in \{1, \ldots, n\}$. Then

\[ \mathbb{E}((A_s^{2N})^{\otimes n}) - \mathbb{E}((A_s^{2N})^{\otimes p}) \otimes \mathbb{E}((A_s^{2N})^{\otimes (n-p)}) = -\frac{1}{2N} \sum_{1 \le i \le p < j \le n} \Big( \int_0^{t-s} \mathbb{E}[R^p_{i,j}(u; s, t)]\,du + \int_0^s \mathbb{E}[Q^p_{i,j}(u; s, t)]\,du \Big). \tag{3.40} \]

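We will now use this, together with (3.22), (3.25), and (3.26), to estimate $H_{p,n-p}(s)$. We estimate each of the two covariance terms separately, in the following two lemmas.

Lemma 3.12. For $p \in \{1, \ldots, n\}$ and $0 \le s \le t$, with $q = n - p$,

\[ \big| \mathrm{Cov}\big[ \mathrm{Tr}((A_s^{2N})^p),\, \mathrm{Tr}((A_s^{2N})^q) \big] \big| \le p(n-p)(t + 3s). \tag{3.41} \]

Proof. From (3.40) together with (3.25), the quantity whose modulus we wish to estimate is

\[ -\sum_{1 \le i \le p < j \le n} \Big( \int_0^{t-s} \frac{1}{2N}\, \mathbb{E}\,\mathrm{Tr}\big[ R^p_{i,j}(u; s, t) \cdot \gamma_{p,n-p} \big]\,du + \int_0^s \frac{1}{2N}\, \mathbb{E}\,\mathrm{Tr}\big[ Q^p_{i,j}(u; s, t) \cdot \gamma_{p,n-p} \big]\,du \Big). \]

For the first term, we note $(U_{t-s-u}^{2N})^{\otimes n} \cdot \gamma_{p,n-p} = \gamma_{p,n-p} \cdot (U_{t-s-u}^{2N})^{\otimes n}$ (the Schur-Weyl representation of any permutation in $S_n$ commutes with a matrix of the form $M^{\otimes n}$). It follows from the trace property that

\[ \begin{aligned} \mathbb{E}\,\mathrm{Tr}\big[ R^p_{i,j}(u; s, t) \cdot \gamma_{p,n-p} \big] &= \mathbb{E}\,\mathrm{Tr}\big[ (V_u^{2N})^{\otimes p} \otimes (W_u^{2N})^{\otimes (n-p)} \cdot (B_s^{2N})^{\otimes n} \cdot \gamma_{p,n-p} \cdot (U_{t-s-u}^{2N})^{\otimes n} \cdot (i\,j) \big] \\ &= \mathbb{E}\,\mathrm{Tr}\big[ (V_u^{2N})^{\otimes p} \otimes (W_u^{2N})^{\otimes (n-p)} \cdot (B_s^{2N})^{\otimes n} \cdot (U_{t-s-u}^{2N})^{\otimes n} \cdot \gamma_{p,n-p} \cdot (i\,j) \big]. \end{aligned} \]

Since $i \le p < j$, the permutation $\gamma_{p,n-p}(i\,j)$ is a single cycle. Thus, the $\otimes n$-fold trace reduces to a trace of some $p, n, i, j$-dependent word in $V_u^{2N}$, $W_u^{2N}$, $B_s^{2N}$, and $U_{t-s-u}^{2N}$. This word is a random element of $\mathbb{U}_{2N}$, and hence

\[ \frac{1}{2N}\, \mathbb{E}\,\mathrm{Tr}\big[ R^p_{i,j}(u; s, t) \cdot \gamma_{p,n-p} \big] = \frac{1}{2N}\, \mathbb{E}\,\mathrm{Tr}[\text{a random matrix in } \mathbb{U}_{2N}], \quad \text{which has modulus} \le 1. \]

Hence, the first integral is

\[ \Big| \int_0^{t-s} \frac{1}{2N}\, \mathbb{E}\,\mathrm{Tr}\big[ R^p_{i,j}(u; s, t) \cdot \gamma_{p,n-p} \big]\,du \Big| \le (t - s). \tag{3.42} \]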


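For the second term, the fact that $T_{i,j} = 2\,(i\,j)\,[P_1 \otimes P_1 + P_2 \otimes P_2]_{i,j}$ only acts non-trivially in the $i, j$ factors, and $i \le p < j$, shows that (as above) $T_{i,j}$ commutes with $(C_u^{2N})^{\otimes p} \otimes (D_u^{2N})^{\otimes (n-p)}$. Hence, we can express the second integrand as

\[ \begin{aligned} \mathbb{E}\,\mathrm{Tr}\big[ Q^p_{i,j}(u; s, t) \cdot \gamma_{p,n-p} \big] &= \mathbb{E}\,\mathrm{Tr}\big[ (V_{t-s}^{2N})^{\otimes p} \otimes (W_{t-s}^{2N})^{\otimes (n-p)} \cdot (B_{s-u}^{2N})^{\otimes n} \cdot T_{i,j} \cdot (C_u^{2N})^{\otimes p} \otimes (D_u^{2N})^{\otimes (n-p)} \cdot \gamma_{p,n-p} \big] \\ &= \mathbb{E}\,\mathrm{Tr}\big[ (V_{t-s}^{2N})^{\otimes p} \otimes (W_{t-s}^{2N})^{\otimes (n-p)} \cdot (B_{s-u}^{2N})^{\otimes n} \cdot T_{i,j} \cdot \gamma_{p,n-p} \cdot (C_u^{2N})^{\otimes p} \otimes (D_u^{2N})^{\otimes (n-p)} \big] \\ &= 2 \sum_{\ell=1}^{2} \mathbb{E}\,\mathrm{Tr}\big[ (C_u^{2N})^{\otimes p} \otimes (D_u^{2N})^{\otimes (n-p)} \cdot (V_{t-s}^{2N})^{\otimes p} \otimes (W_{t-s}^{2N})^{\otimes (n-p)} \cdot (B_{s-u}^{2N})^{\otimes n} \cdot (P_\ell \otimes P_\ell)_{i,j} (i\,j) \cdot \gamma_{p,n-p} \big] \end{aligned} \tag{3.43} \]

where we have used the fact that $\gamma_{p,n-p}$ commutes with any matrix of the form $C^{\otimes p} \otimes D^{\otimes (n-p)}$ in the second equality, and then the trace property in the third equality. As above, each of these terms reduces to a trace of a word, this time of the form

\[ 2 \sum_{\ell=1}^{2} \mathbb{E}\,\mathrm{Tr}(\mathcal{U} P_\ell \mathcal{V} P_\ell) \]

where $\mathcal{U}$ and $\mathcal{V}$ are random matrices in $\mathbb{U}_{2N}$ (depending on $p, n, i, j$). Since $\|P_\ell\| \le 1$, the modulus of each term is $\le 2N$, giving an overall factor of $\le 8N$. Combining with the $\frac{1}{2N}$ in the integral, this gives

\[ \Big| \int_0^s \frac{1}{2N}\, \mathbb{E}\,\mathrm{Tr}\big[ Q^p_{i,j}(u; s, t) \cdot \gamma_{p,n-p} \big]\,du \Big| \le 4s. \tag{3.44} \]

Hence, from (3.42) and (3.44), the modulus of the desired covariance is bounded by

\[ \sum_{1 \le i \le p < j \le n} [(t - s) + 4s] = p(n-p)(t + 3s), \]

yielding (3.41). □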
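The combinatorial fact behind the proof above is that multiplying the double-cycle $\gamma_{p,n-p}$ by a transposition $(i\,j)$ with $i \le p < j$ merges its two cycles into one, and that there are exactly $p(n-p)$ such transpositions. The following Python sketch verifies both statements by brute force for small $n$. Permutations are stored 0-indexed as lists, and all function names are hypothetical helpers introduced for this check; both orders of multiplication are tested so that the conclusion does not depend on the composition convention.

```python
def double_cycle(p, n):
    """gamma_{p,n-p} = (1 ... p)(p+1 ... n), stored 0-indexed as k -> image of k."""
    perm = list(range(n))
    for k in range(p):
        perm[k] = (k + 1) % p
    for k in range(p, n):
        perm[k] = p + (k - p + 1) % (n - p)
    return perm

def transposition(i, j, n):
    perm = list(range(n))
    perm[i], perm[j] = j, i
    return perm

def compose(a, b):
    """(a o b)(k) = a(b(k))."""
    return [a[b[k]] for k in range(len(a))]

def cycle_count(perm):
    seen, count = set(), 0
    for start in range(len(perm)):
        if start not in seen:
            count += 1
            k = start
            while k not in seen:
                seen.add(k)
                k = perm[k]
    return count

def all_products_are_single_cycles(n):
    """Check gamma_{p,n-p} (i j) and (i j) gamma_{p,n-p} for all i <= p < j."""
    for p in range(1, n):
        gamma = double_cycle(p, n)
        pairs = 0
        for i in range(p):            # 0-indexed i < p means 1 <= i <= p
            for j in range(p, n):     # 0-indexed j >= p means p < j <= n
                tau = transposition(i, j, n)
                pairs += 1
                if cycle_count(compose(gamma, tau)) != 1:
                    return False
                if cycle_count(compose(tau, gamma)) != 1:
                    return False
        if pairs != p * (n - p):
            return False
    return True

print([all_products_are_single_cycles(n) for n in range(2, 9)])
```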


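Lemma 3.13. For $p \in \{1, \ldots, n\}$ and $0 \le s \le t$, with $q = n - p$,

\[ \big| \mathrm{Cov}\big[ \mathrm{Tr}((A_s^{2N})^p P_1),\, \mathrm{Tr}((A_s^{2N})^q P_1) \big] \big| \le p(n-p)(t + 3s). \tag{3.45} \]

Proof. From (3.40) together with (3.26), the quantity whose modulus we wish to estimate is

\[ -\sum_{1 \le i \le p < j \le n} \Big( \int_0^{t-s} \frac{1}{2N}\, \mathbb{E}\,\mathrm{Tr}\big[ R^p_{i,j}(u; s, t) \cdot (P_1 \otimes P_1)_{p,n} \cdot \gamma_{p,n-p} \big]\,du + \int_0^s \frac{1}{2N}\, \mathbb{E}\,\mathrm{Tr}\big[ Q^p_{i,j}(u; s, t) \cdot (P_1 \otimes P_1)_{p,n} \cdot \gamma_{p,n-p} \big]\,du \Big). \tag{3.46} \]

For the first term, we expand the integrand, commuting $(i\,j)$ past $(U_{t-s-u}^{2N})^{\otimes n}$ as in the proof of Lemma 3.12, to give

\[ \frac{1}{2N}\, \mathbb{E}\,\mathrm{Tr}\big[ (U_{t-s-u}^{2N})^{\otimes n} \cdot (V_u^{2N})^{\otimes p} \otimes (W_u^{2N})^{\otimes (n-p)} \cdot (B_s^{2N})^{\otimes n} \cdot (P_1 \otimes P_1)_{p,n} \cdot \gamma_{p,n-p} \cdot (i\,j) \big]. \]

As above, since $\gamma_{p,n-p} \cdot (i\,j)$ is a single cycle, this trace reduces to a trace over $\mathbb{C}^{2N}$, of the form

\[ \frac{1}{2N}\, \mathbb{E}\,\mathrm{Tr}[\mathcal{U}' P_1 \mathcal{V}' P_1] \]

where $\mathcal{U}'$ and $\mathcal{V}'$ are random unitary matrices in $\mathbb{U}_{2N}$ composed of certain $i, j, p, n$-dependent words in $U_{t-s-u}^{2N}$, $V_u^{2N}$, $W_u^{2N}$, and $B_s^{2N}$. As $\|P_1\| \le 1$, it follows that this normalized trace is $\le 1$ in modulus, and so the first integral in (3.46) is $\le (t - s)$ in modulus, as in (3.42).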

Similarly, we expand the second term as in (3.43), which gives the sum over $\ell \in \{1, 2\}$ of the expected normalized trace of

\[ (V_{t-s}^{2N})^{\otimes p} \otimes (W_{t-s}^{2N})^{\otimes (n-p)} \cdot (B_{s-u}^{2N})^{\otimes n} \cdot \gamma_{p,n-p} (i\,j) (P_\ell \otimes P_\ell)_{i,j} \cdot (C_u^{2N})^{\otimes p} \otimes (D_u^{2N})^{\otimes (n-p)} \cdot (P_1 \otimes P_1)_{p,n}. \]

As in the above cases, since $\gamma_{p,n-p}(i\,j)$ is a single cycle, this is equal to a single trace $\mathrm{Tr}(A_1 \cdots A_n)$ where $A_1, \ldots, A_n$ belong to the set $\{V_{t-s}^{2N}, W_{t-s}^{2N}, B_{s-u}^{2N}, C_u^{2N}, D_u^{2N}, P_\ell, P_1\}$. As each of these is either a random unitary matrix or a projection, it follows that the expected normalized trace has modulus $\le 1$, and so the $\frac{1}{N}$-weighted sum of $2$ terms, each of modulus $\le 2N$, gives a contribution no bigger than $4$. The remainder of the proof is exactly as the end of the proof of Lemma 3.12. □

Finally, we are ready to conclude this section.

Proof of Proposition 3.5. From (3.21), we have

\[ |\nu_t^N(n) - \nu_t^{2N}(n)| \le \frac{n}{2} \sum_{p=1}^{n-1} \int_0^t |H_{p,n-p}(s)|\,ds. \]

Lemma 3.9 then gives

\[ |H_{p,q}(s)| \le \frac{1}{4N^2} \big| \mathrm{Cov}\big[ \mathrm{Tr}((A_s^{2N})^p),\, \mathrm{Tr}((A_s^{2N})^q) \big] \big| + \frac{1}{N^2} \big| \mathrm{Cov}\big[ \mathrm{Tr}((A_s^{2N})^p P_1),\, \mathrm{Tr}((A_s^{2N})^q P_1) \big] \big|. \]

Combining this with (3.41) and (3.45) therefore yields

\[ |H_{p,n-p}(s)| \le \frac54 \cdot \frac{p(n-p)}{N^2}\,(t + 3s). \]

Integration then gives

\[ |\nu_t^N(n) - \nu_t^{2N}(n)| \le \frac{n}{2} \sum_{p=1}^{n-1} \frac54 \cdot \frac{p(n-p)}{N^2} \int_0^t (t + 3s)\,ds = \frac{25}{16} \cdot \frac{t^2 n}{N^2} \sum_{p=1}^{n-1} p(n-p). \]

The sum over $p$ has the exact value $\frac16 (n^3 - n) \le \frac{n^3}{6}$. The blunt estimate $\frac{25}{96} \le \frac34$ yields the result. □


4 Strong Convergence

In this section, we prove Theorem 1.4. We begin by showing (in Section 4.1) that the eigenvalues of the unitary Brownian motion at a fixed time converge to their "classical locations", and we use this to prove that the unitary Brownian motion can be uniformly approximated by a function of a Gaussian unitary ensemble (for time $t < 4$). We then use this, together with Male's Theorem 2.3, to prove Theorem 1.4.

4.1 Marginals of Unitary Brownian Motion and Approximation by GUE$_N$

Remark 4.1. Throughout this section, we will regularly use the following elementary fact: if $(a_N)_{N=1}^\infty$ is a bounded sequence (so that it possesses a convergent subsequence), and if all convergent subsequences have the same limit $r$, then $\lim_{N \to \infty} a_N = r$.

We begin with the following general result on convergence of empirical measures. As usual, for a probability measure $\mu$ on $\mathbb{R}$, the cumulative distribution function $F_\mu$ is the nondecreasing function $F_\mu(x) = \mu((-\infty, x])$. If $\mu$ has a density $\rho$, we may abuse notation and write $F_\mu = F_\rho$.

Proposition 4.2. For each $N \in \mathbb{N}$, let $(x_k^N)_{k=1}^N$ be points in $\mathbb{R}$ with $x_1^N \le \cdots \le x_N^N$. Let $\mu^N = \frac{1}{N} \sum_{k=1}^N \delta_{x_k^N}$ be the associated empirical measures. Suppose the following hold true.

(1) There is a compact interval $[a_-, a_+]$ and a continuous probability density $\rho$ with $\mathrm{supp}\,\rho = [a_-, a_+]$, so that, with $\mu(dx) = \rho(x)\,dx$, we have $\mu^N \to \mu$ weakly as $N \to \infty$.

(2) $x_1^N \to a_-$ and $x_N^N \to a_+$ as $N \to \infty$.

For $r \in [0,1]$, define $x^*(r) = F_\mu^{-1}(r)$ if $r \in (0,1)$, and $x^*(0) = a_-$, $x^*(1) = a_+$. Then, for any sequence $(k(N))_{N=1}^\infty$ with $k(N) \in \{1, \ldots, N\}$ and $\lim_{N \to \infty} k(N)/N = r$, we have

\[ \lim_{N \to \infty} x_{k(N)}^N = x^*(r). \]

Proof. Since $\mu^N \to \mu$ weakly, it follows that $F_{\mu^N}(x) \to F_\mu(x)$ for any $x$ that is a continuity point of $F_\mu$; since $F_\mu$ is continuous everywhere, we therefore have $F_{\mu^N} \to F_\mu$ pointwise on $\mathbb{R}$. As $\mu$ is compactly supported, it now follows by a variant of Dini's theorem (cf. [39, Problem 127, Chapter II]) that $F_{\mu^N} \to F_\mu$ uniformly. Now, let $k(N)$ be a sequence as stated, and set

\[ y_N = x_{k(N)}^N, \quad \text{and} \quad a_N = F_\mu(y_N). \]

Since $1 \le k(N) \le N$, and since the points $x_k^N$ are ordered, we have $x_1^N \le y_N \le x_N^N$. Therefore item (2) shows that $(y_N)$ is a bounded sequence, and hence possesses convergent subsequences. Fix any convergent subsequence $(y_{N_m})_{m=1}^\infty$; then, since $F_\mu$ is continuous, $(a_{N_m})$ is also convergent. Since $F_{\mu^{N_m}} \to F_\mu$ uniformly, it follows that

\[ |a_{N_m} - F_{\mu^{N_m}}(y_{N_m})| = |F_\mu(y_{N_m}) - F_{\mu^{N_m}}(y_{N_m})| \to 0 \quad \text{as } m \to \infty. \]

Note that, for any $n \in \mathbb{N}$, since $\mu^n$ is the normalized counting measure of the points $(x_k^n)_{k=1}^n$, we have

\[ F_{\mu^n}(y_n) = F_{\mu^n}(x_{k(n)}^n) = \frac{1}{n}\, \#\{k : x_k^n \le x_{k(n)}^n\} = \frac{k(n)}{n} \to r \quad \text{as } n \to \infty. \]

Hence, we have shown that $a_{N_m} \to r$. The limit $r$ does not depend on the subsequence, and so we conclude (cf. Remark 4.1) that

\[ r = \lim_{N \to \infty} a_N = \lim_{N \to \infty} F_\mu(x_{k(N)}^N). \]

Now, let $0 < r < 1$. Since $\mu$ has a positive density on $(a_-, a_+)$, the function $F_\mu$ is continuous and strictly increasing on $[a_-, a_+]$, and so has a strictly increasing continuous inverse function $F_\mu^{-1}$ on $[0,1]$. Thus, it follows that $x_{k(N)}^N = F_\mu^{-1}(a_N) \to F_\mu^{-1}(r) = x^*(r)$ as claimed.

On the other hand, suppose $r = 0$. Then for $0 < \epsilon < 1$, for all large $N$, $k(N) \le \epsilon N$. Thus $x_1^N \le y_N \le x_{\lceil \epsilon N \rceil}^N$. By the preceding paragraph, we know $x_{\lceil \epsilon N \rceil}^N \to x^*(\epsilon)$, and by (2) $x_1^N \to a_- = x^*(0)$; thus $x^*(0) \le \liminf_N y_N \le \limsup_N y_N \le x^*(\epsilon)$. Since $x^*(\epsilon) = F_\mu^{-1}(\epsilon)$ and $F_\mu^{-1}$ is continuous on $[0,1]$, it follows that $x^*(\epsilon) \to x^*(0)$ as $\epsilon \downarrow 0$. So $x_{k(N)}^N = y_N \to x^*(0)$ by the Squeeze Theorem. An analogous argument completes the proof in the case $r = 1$. □

For example, if $(x_k^N)_{k=1}^N$ are the ordered eigenvalues of a GUE$_N$, then Wigner's law (and the corresponding hard edge theorem) show that the empirical spectral distribution satisfies the conditions of Proposition 4.2 almost surely, where the limit measure is the semicircle law $\mu = \sigma_1 = \frac{1}{2\pi} \sqrt{(4 - x^2)_+}\,dx$. In particular, when $k(N)/N \to r$, we have $x_{k(N)}^N \to F_{\sigma_1}^{-1}(r)$. These values are sometimes called the classical locations of the eigenvalues. In the case of a GUE$_N$, much more is known; for example, [23] showed that the eigenvalues have variance of $O(\log N / N^2)$ in the bulk and $O(N^{-4/3})$ at the edge, and further standardizing them, their limit distribution is Gaussian. For our purposes, the macroscopic statement of Proposition 4.2 will suffice.
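For concreteness, the classical locations for the semicircle law can be computed numerically. The following Python sketch uses the standard closed form of the semicircle distribution function, $F_{\sigma_1}(x) = \frac12 + \frac{x\sqrt{4 - x^2}}{4\pi} + \frac{1}{\pi}\arcsin\frac{x}{2}$ on $[-2,2]$, and inverts it by bisection; the function names and the tolerance are our own choices for illustration.

```python
import math

def semicircle_cdf(x):
    """CDF of the semicircle law with density sqrt(4 - x^2) / (2 pi) on [-2, 2]."""
    if x <= -2.0:
        return 0.0
    if x >= 2.0:
        return 1.0
    return (0.5 + x * math.sqrt(4.0 - x * x) / (4.0 * math.pi)
            + math.asin(x / 2.0) / math.pi)

def classical_location(r, tol=1e-12):
    """x*(r) = F^{-1}(r) for 0 < r < 1, by bisection; x*(0) = -2 and x*(1) = 2."""
    if r <= 0.0:
        return -2.0
    if r >= 1.0:
        return 2.0
    lo, hi = -2.0, 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if semicircle_cdf(mid) < r:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Classical locations x*(k / N) for a small N, as in Proposition 4.2 with k(N)/N -> r.
N = 8
print([round(classical_location(k / N), 4) for k in range(N + 1)])
```

Bisection suffices here because $F_{\sigma_1}$ is continuous and strictly increasing on $[-2,2]$, which is exactly the property used in the proof of Proposition 4.2.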


Now, fix $t \in [0,4)$. From Theorem 2.5, the law $\nu_t$ of the free unitary Brownian motion $u_t$ has an analytic density $\varrho_t$ supported on a closed arc strictly contained in $\mathbb{U}_1$, and has the form $\varrho_t(e^{ix}) = \rho_t(x)$ for some strictly positive continuous probability density function $\rho_t \colon (-\pi, \pi) \to \mathbb{R}$ which is symmetric about $0$ and supported in a symmetric interval $[-a(t), a(t)]$ where $a(t) = \frac12 \sqrt{t(4-t)} + \arccos(1 - t/2)$, cf. (2.12). For $0 < r < 1$, define the classical locations $v^*(t,r)$ of the eigenvalues of unitary Brownian motion as follows:

\[ v^*(t,r) = \exp\big( i F_{\rho_t}^{-1}(r) \big), \]

and also set $v^*(t,0) = e^{-ia(t)}$ and $v^*(t,1) = e^{ia(t)}$.
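The edge function $a(t)$ can be tabulated directly from its formula. The short Python sketch below (the function name `support_edge` is ours) evaluates $a(t) = \frac12\sqrt{t(4-t)} + \arccos(1 - t/2)$ and illustrates that the arc $\{e^{ix} : |x| \le a(t)\}$ grows monotonically from a point at $t = 0$ to the full circle at $t = 4$, consistent with the restriction $t < 4$ above.

```python
import math

def support_edge(t):
    """a(t) = (1/2) sqrt(t (4 - t)) + arccos(1 - t/2), for 0 <= t <= 4."""
    return 0.5 * math.sqrt(t * (4.0 - t)) + math.acos(1.0 - t / 2.0)

for t in (0.0, 1.0, 2.0, 3.0, 4.0):
    print(t, support_edge(t))
```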

Corollary 4.3. Let $0 \le t < 4$, and let $V_t^N$ be a random unitary matrix distributed according to the heat kernel on $\mathbb{U}_N$ at time $t$ (i.e. equal in distribution to the $t$-marginal of the unitary Brownian motion $U_t^N$). Enumerate the eigenvalues of $V_t^N$ as $v_1^N(t), \ldots, v_N^N(t)$, in increasing order of complex argument in $(-\pi, \pi)$. Then for any sequence $(k(N))_{N=1}^\infty$ with $k(N) \in \{1, \ldots, N\}$ and $\lim_{N \to \infty} k(N)/N = r$, we have

\[ \lim_{N \to \infty} v_{k(N)}^N(t) = v^*(t,r) \quad \text{a.s.} \]


Proof. Let xf (t) = —i lo g vf (t), where we use the branch of the logarithm cut along the negative real 
axis. Note: by Theorem |l.l| for sufficiently large N, vjf (t) are outside a f-dependent neighborhood of 
—1, and so the log function is continuous. The empirical law of { i;jf (t): 1 < k < N < 00 } converges 
weakly a.s. to u t (cf. ( |1.2| ), and so by continuity, the em pirical measure of {x^f (t): 1 < k < N < 00 } 

ia(t) 

(r) 


1.1 


shows that Vi(t) —> e m ^ and v^(t) 


4.2 


(t) 


converges a.s. to the density p/. Moreover, Theorem 

a.s., and so x^(t) —> — a(t ) while x^(t) a(t) a.s. Hence, by Proposition 
whenever k(N)/N —> r E (0,1), while the sequence converges to +u(t) when r = 0,1. Taking exp(i- 
of these statements yields the corollary. □ 


-TV 

l k(N) 


F- 1 

pt 
Now, let us combine this result with the comparable one for the $\mathrm{GUE}^N$. Let $g_t\colon\mathbb{R}\to\mathbb{R}$ be given by
$g_t=F_{\rho_t}^{-1}\circ F_{\sigma_1}$; this is an increasing, continuous map that pushes $\sigma_1$ forward to $\rho_t$. Define $f_t\colon\mathbb{R}\to\mathbb{U}_1$
by

$$f_t=\exp(ig_t),\qquad \nu_t=(f_t)_*(\sigma_1).\tag{4.1}$$

The main result of this section is that, rather than just pushing the semicircle law forward to the law
of free unitary Brownian motion, $f_t$ in fact pushes a $\mathrm{GUE}^N$ forward, asymptotically, to $V_t^N$ (for fixed
$t\in[0,4)$).
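
The map $g_t=F_{\rho_t}^{-1}\circ F_{\sigma_1}$ is an instance of the monotone (quantile) transport between two laws on the line. The sketch below illustrates only that mechanism: it uses the closed-form distribution function of the semicircle law $\sigma_1$ on $[-2,2]$, and, since $\rho_t$ is not available in closed form here, it substitutes a uniform law on $[-1.5,1.5]$ as a purely hypothetical stand-in target. All function names are illustrative assumptions.

```python
import cmath
import math


def semicircle_cdf(x):
    """Distribution function of the semicircle law on [-2, 2], density sqrt(4 - x^2) / (2 pi)."""
    if x <= -2.0:
        return 0.0
    if x >= 2.0:
        return 1.0
    return 0.5 + x * math.sqrt(4.0 - x * x) / (4.0 * math.pi) + math.asin(x / 2.0) / math.pi


def quantile(cdf, r, lo, hi, tol=1e-12):
    """Inverse of a continuous increasing distribution function on [lo, hi], by bisection."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < r:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def transport_map(target_quantile):
    """Return g = F_target^{-1} o F_sigma1, an increasing map pushing sigma_1 to the target."""
    return lambda x: target_quantile(semicircle_cdf(x))


# Hypothetical stand-in for rho_t: the uniform law on [-1.5, 1.5] (NOT the true density).
half_width = 1.5
uniform_quantile = lambda r: half_width * (2.0 * r - 1.0)

g = transport_map(uniform_quantile)
f = lambda x: cmath.exp(1j * g(x))  # analogue of f_t = exp(i g_t), valued in the unit circle

# Classical location of the semicircle law at level r = 0.25, i.e. F_sigma1^{-1}(0.25).
gue_location = quantile(semicircle_cdf, 0.25, -2.0, 2.0)
print(gue_location, g(gue_location), f(gue_location))
```

By construction the endpoints $\pm2$ of the semicircle are sent to the endpoints of the target interval, mirroring the convention $v^*(t,0)=e^{-ia(t)}$, $v^*(t,1)=e^{ia(t)}$.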

Proposition 4.4. Let $0\le t<4$, and let $V_t^N$ be a random unitary matrix distributed according to the heat
kernel on $\mathbb{U}_N$ at time $t$ (i.e. equal in distribution to the $t$-marginal of the unitary Brownian motion $U_t^N$). There
exists a self-adjoint random matrix $X^N$ with the following properties:

(1) $X^N$ is a $\mathrm{GUE}^N$.

(2) The eigenvalues of $X^N$ are independent from $V_t^N$, and $\{V_t^N,X^N\}$ have the same eigenvectors.

(3) $\|f_t(X^N)-V_t^N\|_{\mathbb{M}_N}\to0$ a.s. as $N\to\infty$.


Proof. Let $(v_k^N(t))_{k=1}^N$ denote the eigenvalues of $V_t^N$ in order of increasing argument in $(-\pi,\pi)$, as
in Corollary 4.3. It is almost surely true that $v_1^N(t)<\cdots<v_N^N(t)$ in this ordering (i.e. the eigenvalues
are distinct), and so we work in this event only. Consider the eigenspace of the eigenvalue $v_k^N(t)$. This
space has complex dimension 1 a.s., and so we may select a unit length vector $E_k^N$ from this space, with
phase chosen uniformly at random in $\mathbb{U}_1$, independently for each of $E_1^N,\ldots,E_N^N$. Then, by orthogonality of
distinct eigenspaces, the random matrix $E^N=[E_1^N\ \cdots\ E_N^N]$ is in $\mathbb{U}_N$; what's more, since the distribution
of $V_t^N$ is invariant under conjugation by unitaries, it follows that $E^N$ is Haar distributed on $\mathbb{U}_N$.

Now, for each $N$, fix a random vector $\mu^N=(\mu_1^N,\ldots,\mu_N^N)$ independent from $V_t^N$ with joint density
$f_{\mu^N}(x_1,\ldots,x_N)$ equal to the known joint density of eigenvalues of a $\mathrm{GUE}^N$, i.e. proportional to

$$f_{\mu^N}(x_1,\ldots,x_N)\sim\prod_{j=1}^Ne^{-\frac N2x_j^2}\prod_{1\le j<k\le N}|x_j-x_k|^2.$$

Then we define

$$X^N=E^N\begin{bmatrix}\mu_1^N&&0\\&\ddots&\\0&&\mu_N^N\end{bmatrix}(E^N)^*.$$
It is well known (cf. QlIlZl) that the distribution of X N is the GUE V , verifying item (1). Item (2) holds 
by construction of X N . It remains to see that (3) holds true. Note, since operator norm is invariant 
under unitary conjugation, we simply have 


$$\|f_t(X^N)-V_t^N\|_{\mathbb{M}_N}=\max_{1\le k\le N}\big|f_t(\mu_k^N)-v_k^N(t)\big|=\big|f_t(\mu^N_{k(N)})-v^N_{k(N)}(t)\big|\tag{4.2}$$

where $k(N)$ is an index in $\{1,\ldots,N\}$ that achieves the maximum. The quantity in (4.2) has modulus
$\le2$ for each $N$, and so there are convergent subsequences. We will show that all convergent subsequences
converge to 0, which proves (3) (cf. Remark 4.1).

Hence, reindexing as necessary, we assume that $|f_t(\mu^N_{k(N)})-v^N_{k(N)}(t)|$ converges. Now, the sequence
$k(N)/N$ is contained in $(0,1]$, and hence it has a convergent subsequence with some limit $r\in[0,1]$.
Again reindexing, we still denote this subsequence as $k(N)$. By Proposition 4.2 in the case of the
$\mathrm{GUE}^N$-eigenvalues, we know that $\mu^N_{k(N)}\to F_{\sigma_1}^{-1}(r)$ a.s. (where this is interpreted to equal $\pm2$ when $r=0,1$).
Then by definition (and continuity) of $f_t$, $f_t(\mu^N_{k(N)})\to\exp\big(iF_{\rho_t}^{-1}(r)\big)=v^*(t,r)$. By Corollary 4.3,
we have $v^N_{k(N)}(t)\to v^*(t,r)$ as well. This shows the limit of the difference is 0.

Thus, given any convergent subsequence of $\|f_t(X^N)-V_t^N\|_{\mathbb{M}_N}$, there is a further subsequence that
converges to 0. It follows that every convergent subsequence of $\|f_t(X^N)-V_t^N\|_{\mathbb{M}_N}$ has limit 0, and so
we conclude that it converges to 0, as claimed. □


4.2 Strong Convergence of the Process $(U_t^N)_{t\ge0}$

Since the Gaussian unitary ensemble is selfadjoint, we may extend Male's Theorem 2.3 to continuous
functions in independent $\mathrm{GUE}^N$s.


Lemma 4.5. Let $\mathbf{A}^N=(A_1^N,\ldots,A_n^N)$ be a collection of random matrix ensembles that converges strongly to
some $\mathbf{a}=(a_1,\ldots,a_n)$ in a $W^*$-probability space $(\mathscr{A},\tau)$. Let $\mathbf{X}^N=(X_1^N,\ldots,X_k^N)$ be independent Gaussian
unitary ensembles independent from $\mathbf{A}^N$, and let $\mathbf{x}=(x_1,\ldots,x_k)$ be freely independent semicircular random
variables in $\mathscr{A}$ all free from $\mathbf{a}$. Let $\mathbf{f}=(f_1,\ldots,f_k)\colon\mathbb{R}\to\mathbb{C}^k$ be continuous functions, and let
$\mathbf{f}(\mathbf{X}^N)=(f_1(X_1^N),\ldots,f_k(X_k^N))$ and $\mathbf{f}(\mathbf{x})=(f_1(x_1),\ldots,f_k(x_k))$. Then $(\mathbf{A}^N,\mathbf{f}(\mathbf{X}^N))$ converges strongly to $(\mathbf{a},\mathbf{f}(\mathbf{x}))$.


Proof. We begin with the case $k=1$. If $p$ is any polynomial, by Theorem 2.3, $(\mathbf{A}^N,p(X_1^N))$ converges
strongly to $(\mathbf{a},p(x_1))$ by definition. Now, let $\epsilon>0$, and fix a noncommutative polynomial $P$ in $n+1$
variables. Then $P(\mathbf{a},y)$ is a finite sum of monomials, each of the form

$$Q_0(\mathbf{a})\,y\,Q_1(\mathbf{a})\,y\cdots Q_{d-1}(\mathbf{a})\,y\,Q_d(\mathbf{a})$$

for some noncommutative polynomials $Q_0,\ldots,Q_d$ and nonnegative integers $d$. Let $d_P$ be the "degree"
of $P$: the maximum number of $Q_k(\mathbf{a})$ terms that appears in any monomial in the above expansion
of $P(\mathbf{a},y)$. Let $M=1+$ the sum of all the products $\|Q_0(\mathbf{a})\|\cdots\|Q_d(\mathbf{a})\|$ over all monomial terms
appearing in $P$. By the Weierstrass approximation theorem, there is a polynomial $p$ in one variable so
that

$$\|p-f_1\|_{L^\infty[-2,2]}<\frac{\epsilon}{8d_PM\big(1+\|f_1\|_{L^\infty[-2,2]}\big)^{d_P}}.\tag{4.3}$$

It follows that, for small enough $\epsilon>0$, we also have $\|p\|_{L^\infty[-2,2]}\le1+\|f_1\|_{L^\infty[-2,2]}$. Now we break up
the difference in the usual manner,

$$\begin{aligned}\big|\|P(\mathbf{A}^N,f_1(X_1^N))\|-\|P(\mathbf{a},f_1(x_1))\|\big|&\le\big|\|P(\mathbf{A}^N,f_1(X_1^N))\|-\|P(\mathbf{A}^N,p(X_1^N))\|\big|\\&\quad+\big|\|P(\mathbf{A}^N,p(X_1^N))\|-\|P(\mathbf{a},p(x_1))\|\big|\\&\quad+\big|\|P(\mathbf{a},p(x_1))\|-\|P(\mathbf{a},f_1(x_1))\|\big|.\end{aligned}\tag{4.4}$$

By the known strong convergence of $(\mathbf{A}^N,p(X_1^N))$ to $(\mathbf{a},p(x_1))$, the middle term is $<\frac\epsilon2$ for all sufficiently
large $N$. For the first and third terms, we use the reverse triangle inequality; in the third term
this gives

$$\big|\|P(\mathbf{a},p(x_1))\|-\|P(\mathbf{a},f_1(x_1))\|\big|\le\|P(\mathbf{a},p(x_1))-P(\mathbf{a},f_1(x_1))\|.$$

Let $y=p(x_1)$ and $z=f_1(x_1)$. We may estimate the norm of the difference using the triangle inequality
summing over all monomial terms; then we have a sum of terms of the form

$$\|Q_0(\mathbf{a})\,y\,Q_1(\mathbf{a})\,y\cdots Q_{d-1}(\mathbf{a})\,y\,Q_d(\mathbf{a})-Q_0(\mathbf{a})\,z\,Q_1(\mathbf{a})\,z\cdots Q_{d-1}(\mathbf{a})\,z\,Q_d(\mathbf{a})\|.\tag{4.5}$$
By introducing intermediate mixed terms of the form $Q_0(\mathbf{a})\,y\cdots Q_{k-1}(\mathbf{a})\,y\,Q_k(\mathbf{a})\,z\cdots Q_{d-1}(\mathbf{a})\,z\,Q_d(\mathbf{a})$ to
give a telescoping sum, we can estimate the term in (4.5) by

$$\|Q_0(\mathbf{a})\|\cdots\|Q_d(\mathbf{a})\|\sum_{k=1}^d\|y\|^{k-1}\|z\|^{d-k}\|y-z\|.\tag{4.6}$$

Since $\|y\|=\|p(x_1)\|=\|p\|_{L^\infty[-2,2]}\le1+\|f_1\|_{L^\infty[-2,2]}$ and $\|z\|=\|f_1(x_1)\|\le1+\|f_1\|_{L^\infty[-2,2]}$, each term
in the previous sum is bounded by

$$\|y\|^{k-1}\|z\|^{d-k}\le\big(1+\|f_1\|_{L^\infty[-2,2]}\big)^{d-1}\le\big(1+\|f_1\|_{L^\infty[-2,2]}\big)^{d_P}.$$

Since $\|y-z\|=\|p-f_1\|_{L^\infty[-2,2]}$, combining this with (4.3) shows that the third term in (4.4) is $<\frac\epsilon4$ for
all large $N$.


The first term in (4.4) is handled in an analogous fashion, with the caveat that the prefactor in
(4.6) is replaced by $\|Q_0(\mathbf{A}^N)\|\cdots\|Q_d(\mathbf{A}^N)\|$. Here we use the assumption of strong convergence of $\mathbf{A}^N$
to show that, for all sufficiently large $N$,

$$\|Q_0(\mathbf{A}^N)\|\cdots\|Q_d(\mathbf{A}^N)\|\le\max\{1,\,2\cdot\|Q_0(\mathbf{a})\|\cdots\|Q_d(\mathbf{a})\|\}.$$

Then we see that the first term in (4.4) is $<\frac\epsilon4$ for all large $N$, and so we have bounded the sum $<\epsilon$ for
all large $N$, concluding the proof of the lemma in the case $k=1$.

Now suppose we have verified the conclusion of the lemma for a given $k$. We proceed by induction.
Taking $(\mathbf{A}^N,f_1(X_1^N),\ldots,f_k(X_k^N))$ as our new input vector, since $f_{k+1}(X_{k+1}^N)$ is independent from all
previous terms, the induction hypothesis and the preceding argument in the case $k=1$ give strong
convergence of the augmented vector $(\mathbf{A}^N,f_1(X_1^N),\ldots,f_k(X_k^N),f_{k+1}(X_{k+1}^N))$ as well. Hence, the proof
is complete by induction. □


This finally brings us to the proof of Theorem 1.4.

Proof of Theorem 1.4. As above, let $\mathbf{A}^N=(A_1^N,\ldots,A_n^N)$ and let $\mathbf{a}=(a_1,\ldots,a_n)$ be the strong limit. By
reindexing the order of the variables in the noncommutative polynomial $P$ appearing in the definition
of strong convergence, it suffices to prove the theorem in the case of time-ordered entries: $U^N_{t_1},\ldots,U^N_{t_k}$
with $t_1\le t_2\le\cdots\le t_k$. What's more, we may assume without loss of generality that the time
increments $s_1=t_1$, $s_2=t_2-t_1$, ..., $s_k=t_k-t_{k-1}$ are all in $[0,4)$. Indeed, if we know the theorem holds
in this case, then for a list of ordered times with some gaps 4 or larger, we may introduce intermediate
times until all gaps are $<4$; then the restricted theorem implies strong convergence for this longer list
of marginals, which trivially implies strong convergence for the original list.
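
The reduction to increments in $[0,4)$ is a simple refinement of the time grid. A minimal sketch of one way to do it follows; the helper name `refine_times` and the choice to split each long gap into equal pieces are assumptions made for illustration, since the argument only needs some refinement with all gaps below 4.

```python
def refine_times(times, max_gap=4.0):
    """Insert intermediate times so every increment (the first one measured from 0)
    is strictly smaller than max_gap. The original times are kept in the output."""
    refined = []
    previous = 0.0
    for t in sorted(times):
        gap = t - previous
        # Smallest number of equal pieces making each piece strictly below max_gap.
        pieces = int(gap // max_gap) + 1
        for i in range(1, pieces):
            refined.append(previous + gap * i / pieces)
        refined.append(t)
        previous = t
    return refined


def increments(times):
    """Increments s_1 = t_1, s_j = t_j - t_{j-1} of an ordered list of times."""
    return [b - a for a, b in zip([0.0] + list(times[:-1]), times)]


grid = refine_times([1.0, 10.0])
print(grid, increments(grid))
```

Strong convergence for the refined list restricts to the original list because the original times form a sublist of the refined one.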

Now, set $V^N_{s_1}=U^N_{t_1}$, and $V^N_{s_j}=(U^N_{t_{j-1}})^*U^N_{t_j}$ for $2\le j\le k$. As discussed in Section 2.1, these
increments of the process are independent, and $V^N_{s_j}$ has the same distribution as $U^N_{s_j}$. Hence, by
Proposition 4.4, there are $k$ independent $\mathrm{GUE}^N$s $X_1^N,\ldots,X_k^N$, and continuous functions $f_{s_j}\colon\mathbb{R}\to\mathbb{C}$, so that
$\|f_{s_j}(X_j^N)-V^N_{s_j}\|_{\mathbb{M}_N}\to0$ as $N\to\infty$. Since the $V^N_{s_j}$ are all independent from $\mathbf{A}^N$, so are the $X_j^N$. Hence,
by Lemma 4.5, taking $x_1,\ldots,x_k$ freely independent semicircular random variables all free from $\mathbf{a}$, it
follows that

$$(\mathbf{A}^N,f_{s_1}(X_1^N),\ldots,f_{s_k}(X_k^N))\ \text{converges strongly to}\ (\mathbf{a},f_{s_1}(x_1),\ldots,f_{s_k}(x_k)).$$

By the definition of the mapping $f_s$ (cf. (4.1)), $f_{s_j}(x_j)$ has distribution $\nu_{s_j}$, and as all variables in sight
are free, $(\mathbf{a},f_{s_1}(x_1),\ldots,f_{s_k}(x_k))$ has the same distribution as $(\mathbf{a},v_{s_1},\ldots,v_{s_k})$ where $(v_s)_{s\ge0}$ is a free
unitary Brownian motion, freely independent from $\mathbf{a}$.
It now follows, since $\|f_{s_j}(X_j^N)-V^N_{s_j}\|_{\mathbb{M}_N}\to0$, that

$$(\mathbf{A}^N,V^N_{s_1},\ldots,V^N_{s_k})\ \text{converges strongly to}\ (\mathbf{a},v_{s_1},\ldots,v_{s_k}).$$

(The proof is very similar to the proof of Lemma 4.5.) Finally, we can recover the original variables
$U^N_{t_j}=V^N_{s_1}V^N_{s_2}\cdots V^N_{s_j}$. Therefore

$$(\mathbf{A}^N,U^N_{t_1},\ldots,U^N_{t_k})\ \text{converges strongly to}\ (\mathbf{a},v_{s_1},v_{s_1}v_{s_2},\ldots,v_{s_1}v_{s_2}\cdots v_{s_k}).$$

The discussion at the end of Section 2.4 shows that $(v_{s_1},v_{s_1}v_{s_2},\ldots,v_{s_1}v_{s_2}\cdots v_{s_k})$ has the same distribution
as $(u_{t_1},u_{t_2},\ldots,u_{t_k})$ where $(u_t)_{t\ge0}$ is a free unitary Brownian motion in the $W^*$-algebra generated
by $(v_s)_{s\ge0}$, and is therefore freely independent from $\mathbf{a}$. This concludes the proof. □
5 Application to the Jacobi Process


In this final section we combine our main Theorem 1.4 with some of the results of our earlier paper [12],
to show that the Jacobi process (cf. (5.1) and (5.2)) has hard edges that evolve with finite propagation
speed.

There are three classical Hermitian Gaussian ensembles that have been well studied. The first is the
Gaussian Unitary Ensemble described in detail above, whose analysis was initiated by Wigner [50] and
began random matrix theory. The second is the Wishart Ensemble, also known (through its applications
in statistics) as a sample covariance matrix. Let $a\ge1$, and let $X=X^N$ be an $N\times\lfloor aN\rfloor$ matrix all of
whose entries are independent normal random variables of variance $\frac1N$; then $W=XX^*$ is a Wishart
ensemble with parameter $a$. As $N\to\infty$, its empirical spectral distribution converges almost surely
to a law known as the Marchenko-Pastur distribution; this was proved in [36]. As with the Gaussian
Unitary Ensemble, it also has a hard edge, and the largest eigenvalue when properly renormalized has
the Tracy-Widom law.

The third Hermitian Gaussian ensemble is the Jacobi Ensemble. Let $W_a$ and $W'_b$ be independent
Wishart ensembles of parameters $a,b\ge1$. Then it is known that $W_a+W'_b$ is a Wishart ensemble of
parameter $a+b$, and is a.s. invertible (cf. [11, Lemma 2.1]). The associated Jacobi Ensemble is

$$J=J_{a,b}=(W_a+W'_b)^{-\frac12}\,W_a\,(W_a+W'_b)^{-\frac12}.\tag{5.1}$$


Such matrices have been studied in the statistics literature for over thirty years; they play a key role
in MANOVA (multivariate analysis of variance) and are sometimes simply called MANOVA matrices.
The joint law of eigenvalues is explicitly known, but the large-$N$ limit is notoriously harder than the
Gaussian Unitary and Wishart Ensembles. In [11], the present first author made the following discovery
which led to a new approach to the asymptotics of the ensemble: its joint law can be described by
a product of randomly rotated projections, as follows. (For the sake of making the statement simpler,
we assume $a,b$ are such that $aN$ and $bN$ are integers.)

Theorem 5.1 ([11, Theorem 2.2]). Let $J_{a,b}=J^N_{a,b}$ be an $N\times N$ Jacobi ensemble with parameters $a,b\ge1$.
Let $P,Q\in\mathbb{M}_{(a+b)N}$ be (deterministic) orthogonal projections with $\mathrm{rank}(P)=bN$ and $\mathrm{rank}(Q)=N$. Let
$U\in\mathbb{U}_{(a+b)N}$ be a random unitary matrix sampled from the Haar measure. Then $QU^*PUQ$, viewed as a
random matrix in $\mathbb{M}_N$ via the unitary isomorphism $\mathbb{M}_N\cong Q\,\mathbb{M}_{(a+b)N}\,Q$, has the same distribution as $J_{a,b}$.

Given two closed subspaces $\mathbb{V},\mathbb{W}$ of a Hilbert space $H$, if $P\colon H\to\mathbb{V}$ and $Q\colon H\to\mathbb{W}$ are the orthogonal
projections, then the operator $QPQ$ is known as the operator valued angle between the two
subspaces. (Indeed, in the finite-dimensional setting, the eigenvalues of $QPQ$ are trigonometric polynomials
in the principal angles between the subspaces $\mathbb{V}$ and $\mathbb{W}$.) Thus, the law of the Jacobi ensemble
records all the remaining information about the angles between two uniformly randomly rotated subspaces
of fixed ranks. These observations were used to make significant progress in understanding the
Jacobi Ensemble in statistical applications (cf. [29]), and to generalize many of these results to the full
expected universality class (beyond Gaussian entries) in the limit (cf. [21]).

In terms of the large-$N$ limit: letting $\alpha=\frac{b}{a+b}$ and $\beta=\frac{1}{a+b}$, we have $\mathrm{tr}\,P=\alpha$ and $\mathrm{tr}\,Q=\beta$
fixed as $N$ grows, and therefore there are limit projections $p,q$ of these same traces. The chosen Haar
distributed unitary matrices converge in noncommutative distribution to a Haar unitary operator $u$
freely independent from $p,q$, and so the empirical spectral distribution of $J_{a,b}$ converges to the law
of $qu^*puq$, which was explicitly computed in [49] as an elementary example of free multiplicative
convolution:

$$\mathrm{Law}_{qu^*puq}=(1-\min\{\alpha,\beta\})\,\delta_0+\max\{\alpha+\beta-1,0\}\,\delta_1+\frac{\sqrt{(r_+-x)(x-r_-)}}{2\pi x(1-x)}\,\mathbb{1}_{[r_-,r_+]}\,dx,$$

where $r_\pm=\alpha+\beta-2\alpha\beta\pm2\sqrt{\alpha\beta(1-\alpha)(1-\beta)}$. Furthermore, it was shown in [29] that the Jacobi
Ensemble has a hard edge, the rate of convergence of the largest eigenvalue is $N^{-2/3}$ (as with the
Gaussian Unitary and Wishart Ensembles), and the rescaled limit distribution of the largest eigenvalue
is the Tracy-Widom law of [44].
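
As a sanity check on the limit law above, the sketch below computes the two atoms and the edges $r_\pm$ for given traces $\alpha,\beta\in(0,1)$, and integrates the density numerically to confirm that the three pieces add up to total mass 1. The function names and the midpoint-rule integrator are illustrative assumptions; the substitution $x=\frac{r_++r_-}{2}+\frac{r_+-r_-}{2}\sin\theta$ is used only to remove the square-root behaviour at the edges.

```python
import math


def jacobi_limit_law(alpha, beta):
    """Atoms and edges of the law of q u* p u q for free projections of traces alpha, beta."""
    if not (0.0 < alpha < 1.0 and 0.0 < beta < 1.0):
        raise ValueError("traces alpha, beta are assumed to lie strictly between 0 and 1")
    center = alpha + beta - 2.0 * alpha * beta
    spread = 2.0 * math.sqrt(alpha * beta * (1.0 - alpha) * (1.0 - beta))
    return {
        "mass_at_0": 1.0 - min(alpha, beta),
        "mass_at_1": max(alpha + beta - 1.0, 0.0),
        "r_minus": center - spread,
        "r_plus": center + spread,
    }


def continuous_mass(alpha, beta, steps=20000):
    """Midpoint-rule integral of sqrt((r+ - x)(x - r-)) / (2 pi x (1 - x)) over [r-, r+]."""
    law = jacobi_limit_law(alpha, beta)
    mid = 0.5 * (law["r_plus"] + law["r_minus"])
    half = 0.5 * (law["r_plus"] - law["r_minus"])
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = -0.5 * math.pi + (i + 0.5) * h
        x = mid + half * math.sin(theta)
        # sqrt((r+ - x)(x - r-)) = half*cos(theta) and dx = half*cos(theta) dtheta
        total += (half * math.cos(theta)) ** 2 / (2.0 * math.pi * x * (1.0 - x)) * h
    return total


law = jacobi_limit_law(0.3, 0.6)
total_mass = law["mass_at_0"] + law["mass_at_1"] + continuous_mass(0.3, 0.6)
print(law, total_mass)
```

Using $r_-r_+=(\alpha-\beta)^2$ and $(1-r_-)(1-r_+)=(1-\alpha-\beta)^2$, the continuous part has mass $\frac12\big(1-|\alpha-\beta|-|1-\alpha-\beta|\big)=\min\{\alpha,\beta\}-\max\{\alpha+\beta-1,0\}$, which is what the numerical integral reproduces.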

Simultaneously to these developments, Voiculescu [48] introduced free liberation. Given two subalgebras
$A,B$ of a $W^*$-probability space $(\mathscr{A},\tau)$ and a Haar unitary operator $u\in\mathscr{A}$ that is freely independent
from $A,B$, the rotated subalgebra $u^*Au$ is freely independent from $B$. If $(u_t)_{t\ge0}$ is a free
unitary Brownian motion freely independent from $A,B$, it is not generally true that $u_t^*Au_t$ is free from
$B$ for any finite $t$ (in particular when $t=0$ we just have $A,B$), but since the (strong operator) limit
as $t\to\infty$ of $u_t$ is a Haar unitary, this process "liberates" $A$ and $B$. This concept was used to define
several important regularized versions of measures associated to free entropy and free information
theory, and to this day plays an important role in free probability theory. The special case that $A,B$
are algebras generated by two projections has been extensively studied [15, 16, 17, 18, 19, 25, 26, 27], as
the best special case where one can hope to compute all quantities fairly explicitly.

In the first and third authors' paper [12, Section 3.2], the following was proved. 

Theorem 5.2 ([12, Lemmas 3.2-3.6]). Let $p,q$ be orthogonal projections with traces $\alpha,\beta$, and let $(u_t)_{t\ge0}$ be a
free unitary Brownian motion freely independent from $p,q$. Let $\mu_t=\mathrm{Law}_{qu_t^*pu_tq}$. Then

$$\mu_t=(1-\min\{\alpha,\beta\})\,\delta_0+\max\{\alpha+\beta-1,0\}\,\delta_1+\tilde\mu_t$$

where $\tilde\mu_t$ is a positive measure (of mass $\min\{\alpha,\beta\}-\max\{\alpha+\beta-1,0\}$). Let $I_1,I_2$ be two disjoint open
subintervals of $(0,1)$. If $\mathrm{supp}\,\tilde\mu_{t_0}\subset I_1\cup I_2$ for some $t_0\ge0$, then $\mathrm{supp}\,\tilde\mu_t\subset I_1\cup I_2$ for $|t-t_0|$ sufficiently
small; moreover, $\tilde\mu_t(I_1)$ and $\tilde\mu_t(I_2)$ do not vary with $t$ close to $t_0$.

If $\tilde\mu_t$ has a continuous density on $(0,1)$ for $t>0$, and $x_{t_0}\in(0,1)$ is a boundary point of $\mathrm{supp}\,\tilde\mu_{t_0}$, then for
$|t-t_0|$ sufficiently small there is a $C^1$ function $t\mapsto x(t)$ with $x(t_0)=x_{t_0}$ so that $x(t)$ is a boundary point of
$\mathrm{supp}\,\tilde\mu_t$.

Finally, in the special case $\alpha=\beta=\frac12$, for all $t>0$, $\tilde\mu_t$ possesses a continuous density which is analytic on
the interior of its support.


Remark 5.3. (1) It is expected that the final statement, regarding the existence of a continuous density,
holds true for all $\alpha,\beta\in(0,1)$; at present time, this is only known for $\alpha=\beta=\frac12$. Nevertheless,
the "islands stay separated" result holds in general.


(2) Our method of proof of the regularity of $\tilde\mu_t$ involved a combination of free probability, complex
analysis, and PDE techniques. In the preprint [28], the authors partly extended this framework
beyond the $\alpha=\beta=\frac12$ case; but they were still not able to prove continuity of the measure. They
did, however, give a much simpler proof of the result in the case $\alpha=\beta=\frac12$: here, $\tilde\mu_t$ can be
described as the so-called Szegő transform (from the unit circle to the unit interval) of the law of
$v_0u_t$, where $v_0$ is the inverse Szegő transform of the law of $qpq$. Via this description, the regularity
result is an immediate consequence of Theorem 2.5 above.


(3) Let us note that $\alpha=\beta=\frac12$ corresponds to $a=b=1$, meaning the "square" Jacobi ensemble.
This is, of course, the case that is least interesting to statisticians: in MANOVA problems the data
sets are typically time series, where there are many more samples than detection sites, meaning
that $a,b\gg1$. In fact, it is generally questioned whether the Jacobi Ensemble is a realistic regime
for real world applications, rather than building the Wishart Ensembles out of $N\times M$ Gaussian
matrices where $M/N\to\infty$.
Thus, it is natural to consider the corresponding finite-$t$ deformation of the Jacobi Ensemble. The
matrix Jacobi process associated to the projections $P^N,Q^N\in\mathbb{M}_N$ is given by

$$J_t^N=Q^N(U_t^N)^*P^NU_t^NQ^N\tag{5.2}$$

where $(U_t^N)_{t\ge0}$ is a Brownian motion in $\mathbb{U}_N$. (Typically $P^N,Q^N$ are deterministic; they may also be
chosen randomly, in which case $U_t^N$ must be chosen independent from them.) This is a diffusion
process in $\mathbb{M}_N^{[0,1]}$: it lives a.s. in the space of matrices $M\in\mathbb{M}_N$ with $0\le M\le1$ (i.e. $M$ is self-adjoint
and has eigenvalues in $[0,1]$). Note that the initial value is $J_0^N=Q^NP^NQ^N$, the operator-valued angle
between the subspaces in the images of $P^N,Q^N$. In particular, the Jacobi process records (through
its eigenvalues) the evolution of the principal angles between two subspaces as they are continuously
rotated by a unitary Brownian motion.

In the case N = 1, the process ( |5.2| precisely corresponds to what is classically known as the 
Jacobi process: the Markov process on [0,1] with generator d£ = x(x — — (cx + d) jk, where 

$c=2\min\{\alpha,\beta\}-1$, $d=|\alpha-\beta|$. This is where the name comes from, as the orthogonal polynomials
associated to this Markov process are the Jacobi polynomials, cf. [19].


Remark 5.4. Comparing to Theorem 5.1, we have now compressed the projections and the Brownian
motion into $\mathbb{M}_N$ from the start. We could instead formulate the process as in that theorem by choosing
projections and Brownian motion in a larger space, which would have the effect of using a "corner"
of a higher-dimensional Brownian motion instead of $U_t^N$. While this makes a difference for the
finite-dimensional distribution, it does not affect the large-$N$ behavior.


This brings us to our main application. First note that, from our main Theorem 1.4, the Jacobi
process converges strongly.


Corollary 5.5. Let $P^N,Q^N$ be deterministic orthogonal projections in $\mathbb{M}_N$, and suppose $\{P^N,Q^N\}$ converges
strongly to $\{p,q\}$. Let $(u_t)_{t\ge0}$ be a free unitary Brownian motion freely independent from $p,q$. Then for each
$t\ge0$ the Jacobi process marginal $J_t^N$ converges strongly to $j_t=qu_t^*pu_tq$. What's more, if $f\in C[0,1]$ is any
continuous test function, then $\|f(J_t^N)\|\to\|f(j_t)\|$ a.s. as $N\to\infty$.


Proof. The strong convergence statement is an immediate corollary to Theorem 1.4, with $A_1^N=P^N$,
$A_2^N=Q^N$, and $n=2$, $k=1$. The extension to continuous test functions beyond polynomials is then
an elementary Weierstrass approximation argument. □


Example 5.6. For fixed $k\in\mathbb{N}$, select two orthogonal projections $P,Q\in\mathbb{M}_k$. Then define $P^N,Q^N\in\mathbb{M}_{kN}$
by $P^N=P\otimes I_N$ and $Q^N=Q\otimes I_N$. (Here we are identifying $\mathbb{M}_k\otimes\mathbb{M}_N\cong\mathbb{M}_{kN}$ via the Kronecker
product.) If $F$ is a noncommutative polynomial in two variables, then

$$F(P^N,Q^N)=F(P,Q)\otimes I_N$$

and it follows immediately that $\{P^N,Q^N\}$ converges strongly to $\{P,Q\}$ (i.e. the $W^*$-probability space
can be taken to be $(\mathbb{M}_k,\mathrm{tr})$). Expanding this space to include a free unitary Brownian motion freely
independent from $\{P,Q\}$ and setting $j_t=Qu_t^*Pu_tQ$, Corollary 5.5 yields that the Jacobi process $J_t^N$
with initial value $Q^NP^NQ^N$ converges strongly to $j_t$.

Figure 2 illustrates the eigenvalues of $J_t^N$ with $k=4$, $N=100$, and initial projections given by

$$P=\begin{bmatrix}0.2&0.4\\0.4&0.8\end{bmatrix}\otimes\begin{bmatrix}1&0\\0&0\end{bmatrix}+\begin{bmatrix}0.8&0.4\\0.4&0.2\end{bmatrix}\otimes\begin{bmatrix}0&0\\0&1\end{bmatrix},\qquad Q=\begin{bmatrix}1&0\\0&0\end{bmatrix}\otimes I_2,$$

which have been selected so that the initial operator-valued angle $QPQ$ has non-trivial eigenvalues 0.2
and 0.8; this therefore holds as well for $Q^NP^NQ^N$ for all $N$. This implies that the subspaces $P^N(\mathbb{C}^{kN})$
and $Q^N(\mathbb{C}^{kN})$ have precisely two distinct principal angles.
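
The claims in this example are finite-dimensional and can be checked directly. The sketch below builds $P$ and $Q$ with a hand-written Kronecker product (the helper names `kron`, `matmul`, `add`, `close` are illustrative), verifies that $P$ is a projection and that $QPQ$ is diagonal with non-trivial eigenvalues 0.2 and 0.8, and checks the identity $F(P^N,Q^N)=F(P,Q)\otimes I_N$ for the monomial $F(P,Q)=QPQ$ with $N=3$. The conversion of eigenvalues to principal angles assumes the usual convention that the non-zero eigenvalues of $QPQ$ are $\cos^2$ of the principal angles.

```python
import math


def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def matmul(a, b):
    """Ordinary matrix product."""
    return [[sum(a[i][m] * b[m][j] for m in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def add(a, b):
    """Entrywise sum."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def close(a, b, tol=1e-12):
    """Entrywise comparison up to a tolerance."""
    return all(abs(x - y) < tol for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


E11 = [[1.0, 0.0], [0.0, 0.0]]
E22 = [[0.0, 0.0], [0.0, 1.0]]
P = add(kron([[0.2, 0.4], [0.4, 0.8]], E11), kron([[0.8, 0.4], [0.4, 0.2]], E22))
Q = kron(E11, identity(2))

QPQ = matmul(matmul(Q, P), Q)
diagonal = [QPQ[i][i] for i in range(4)]

# Assumed convention: non-zero eigenvalues of QPQ equal cos^2 of the principal angles.
principal_angles = [math.acos(math.sqrt(lam)) for lam in diagonal if lam > 0.0]

# Check F(P^N, Q^N) = F(P, Q) (x) I_N for F(P, Q) = QPQ and N = 3.
N = 3
PN, QN = kron(P, identity(N)), kron(Q, identity(N))
tensor_identity_holds = close(matmul(matmul(QN, PN), QN), kron(QPQ, identity(N)))

print(diagonal, principal_angles, tensor_identity_holds)
```

Because $QPQ$ is diagonal for this choice of $P$ and $Q$, its eigenvalues can be read off without an eigenvalue solver.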
Figure 2: The spectral distribution of the Jacobi process $J_t^N$ of Example 5.6 with $k=4$, $N=100$ and
times $t=0.01$ (on the left) and $t=0.25$ (on the right). The histograms were made with 1000 trials each,
yielding $4\times10^5$ eigenvalues sorted into 1000 bins.


As is plainly visible in Figure 2, for small time, the eigenvalues (which are fixed trigonometric
polynomials in the principal angles) stay close to their initial values. That is: despite the fact that the
diffusion's measure is fully supported on $\mathbb{M}_N^{[0,1]}$ for every $t,N>0$, the eigenvalues move with finite
speed for all large $N$. That is our final theorem.

Theorem 5.7. For each $N\ge1$, let $(U_t^N)_{t\ge0}$ be a Brownian motion on $\mathbb{U}_N$, let $V^N$ and $W^N$ be subspaces of $\mathbb{C}^N$,
and suppose that the orthogonal projections onto these subspaces converge jointly strongly as $N\to\infty$. Suppose
there is a fixed finite set $\theta=\{\theta_1,\ldots,\theta_k\}$ of angles so that all principal angles between $V^N$ and $W^N$ are in $\theta$ for
all $N$. Then, for any open neighborhood $\mathcal{O}$ of $\theta$, there is a $T>0$ so that, for $0\le t\le T$, it is a.s. true that for all
large $N$, all principal angles between $U_t^N(V^N)$ and $W^N$ are in $\mathcal{O}$.

Proof. Let $P^N$ and $Q^N$ be the projections onto $V^N$ and $W^N$. Then there is a fixed list $\Lambda=\{\lambda_1,\ldots,\lambda_k\}$ in
$[0,1]$ so that all eigenvalues of $Q^NP^NQ^N$ are in $\Lambda$. (The eigenvalues $\lambda_j$ are certain fixed trigonometric
polynomials in $\theta$.) Let $J_t^N$ be the Jacobi process associated to $P^N,Q^N$, and let $j_t$ be the associated
large-$N$ limit. By Corollary 5.5, for any $t\ge0$ and any $f\in C[0,1]$, $\|f(J_t^N)\|\to\|f(j_t)\|$ a.s. as $N\to\infty$.

Applying this at time $t=0$, let $\lambda_i,\lambda_j\in\Lambda$ with $\lambda_i<\lambda_j$ such that no elements of $\Lambda$ are in the interval
$(\lambda_i,\lambda_j)$. Now let $f$ be a continuous bump function supported in $(\lambda_i,\lambda_j)$. Then $f(J_0^N)=0$, and it
therefore follows that $\|f(j_0)\|=0$. As this holds for all bump functions supported in $(\lambda_i,\lambda_j)$, it follows
that $\mathrm{spec}(j_0)$ does not intersect $(\lambda_i,\lambda_j)$. Thus $j_0$ has pure point spectrum precisely equal to $\Lambda$.

Now, fix any $\epsilon>0$; by (induction on) Theorem 5.2, for sufficiently small $t>0$, $\mathrm{spec}(j_t)$ is contained
in $\Lambda_\epsilon$ (the union of $\epsilon$-balls centered at the points of $\Lambda$). Now, suppose (for a contradiction) that, for
some $N_0$, $J_t^{N_0}$ possesses an eigenvalue $\lambda\in(0,1)\setminus\Lambda_\epsilon$. Let $g$ be a bump function supported in $(0,1)\setminus\Lambda_\epsilon$
that is equal to 1 on a neighborhood of $\lambda$; then $\|g(J_t^{N_0})\|\ge1$. But, by Corollary 5.5, we know $\|g(J_t^N)\|\to
\|g(j_t)\|=0$ a.s. as $N\to\infty$. Thus, for all sufficiently large $N$, $\|g(J_t^N)\|<1$, which implies that $\lambda$ is
not an eigenvalue. As this holds for any point in $(0,1)\setminus\Lambda_\epsilon$, it follows that $\mathrm{spec}(J_t^N)$ is almost surely
contained in $\Lambda_\epsilon$ for all sufficiently large $N$.

The result now follows from the fact that the principal angles between $U_t^N(V^N)$ and $W^N$ are fixed
continuous functions (trigonometric polynomials) in the eigenvalues of $J_t^N$. □